Abstract

Starting with Einstein's famous papers of 1905, we review some of the ensu- 
ing developments and their impact on present-day physics. We attempt to cover 
topics that are of interest to historians and philosophers of science as well as to 
physicists.

the special March 2006 issue of Studies in History and Philosophy of Modern
Physics.

1 Introduction



The assignment we were given for this article was to describe the impact of Einstein's 
work on 20th-century physics. This formulation of our task is somewhat problematic 
given that a sizable fraction of 20th-century physics is Einstein's work and most of the 
rest is more or less directly connected to it. Hence Einstein's impact definitely cannot 
be treated perturbatively. In fact, it would have been much easier to write about those 
developments of 20th-century physics that were not connected to the work of Einstein. 
But who would want to read or write that? 

Einstein's major, enduring contributions to physics were made during the first quar- 
ter of the 20th century. They can roughly be divided into four main branches: (1) sta- 
tistical physics, (2) early quantum theory of light and matter, (3) Special Relativity, 
and (4) General Relativity (theory of spacetime and gravitation). Our article is struc- 
tured accordingly, in that we will write about each branch in turn. We regret not being 
able to include material on present-day attempts to reconcile General Relativity with 
Quantum Field Theory, but that would have added another 20 pages or so to an already 
fairly lengthy article. 

Some topics we write about seem (to us) mandatory, others are chosen according 
to personal prejudices and/or predilections. Sometimes much more could have been 
said, whereas in other places less detail would have sufficed to give a first impression. 
For several reasons we decided against keeping the discussions at a constant technical 
level. In some cases we put more emphasis on the historical context, in others we 
chose to display some technical details. As far as the latter are concerned, we feel 
that it is important not just to recount the greatness of Einstein's thoughts, but also to 
put some flesh on these thoughts to see this greatness taking on a definite shape. In 
any case, we wanted to avoid letting the discussion degenerate into a sterile succession 
of "statements of affairs". In addition, we hope to address physicists with interest in 
the history and philosophy of their science as well as historians and philosophers of 
science with an interest in physics proper. This has called for various compromises. 
We hope to have found a readable and enjoyable balance, being well aware of Einstein's
dictum: "Wer es unternimmt, auf dem Gebiet der Wahrheit und der Erkenntnis
als Autorität aufzutreten, scheitert am Gelächter der Götter" (Einstein, 1977, p. 106). 1

2 Einstein and statistical physics 
2.1 A brief survey 

When Einstein's great papers of 1905 appeared in print, he was not a newcomer to the 
Annalen der Physik, in which he published most of his early work. Of crucial impor- 
tance for his further research were three early papers on the foundations of statistical 
mechanics, in which he tried to fill what he considered to be a gap in the mechanical 
foundations of thermodynamics. When Einstein wrote his three papers he was not fa- 
miliar with the work of Gibbs and only partially with that of Boltzmann. Einstein's 
papers, like Gibbs's Elementary Principles of Statistical Mechanics of 1902, form a
bridge between Boltzmann's work and the modern approach to statistical mechanics.
In particular, Einstein independently formulated the distinction between the micro-
canonical and canonical ensembles and derived the equilibrium distribution for the
canonical ensemble from the microcanonical distribution. Of special importance for
his later research was the derivation of the energy-fluctuation formula for the canonical
ensemble.

1 "He who endeavors to present himself as an authority in matters of truth and cognition, will be wrecked
by the laughter of the gods". The original German text first appeared in (Einstein, 1952).

Einstein's profound insight into the nature and size of fluctuations played a decisive 
role for his most revolutionary contribution to physics: the light-quantum hypothesis. 
Indeed, Einstein extracted the light-quantum postulate from a statistical-mechanical 
analogy between radiation in the Wien regime 2 and a classical ideal gas of material 
particles. In this consideration Boltzmann's principle, relating entropy and probability 
of macroscopic states, played a key role. Later Einstein extended these considerations 
to an analysis of energy and momentum fluctuations in the radiation field. For the 
latter he also drew on ideas and methods he had developed in the course of his work on 
Brownian motion, another beautiful application of fluctuation theory. This definitively 
established the reality of atoms and molecules, and, more generally, provided strong 
support for the molecular-kinetic theory of thermodynamics. 

Fluctuations also played a prominent role in Einstein's beautiful work on critical 
opalescence. Many years later he applied this magic wand once more to gases of 
identical particles, satisfying the Bose-Einstein statistics. With this work in 1924 he 
extended the particle-wave duality for photons to massive particles. It is well known
that Schrödinger was strongly influenced by this profound insight (see below).

2.2 Foundations of statistical mechanics 

Already as a student Einstein was very interested in thermodynamics and kinetic the- 
ory, and he intensively studied some of Boltzmann's work. As he wrote on September 
13, 1900 to Mileva Marić:

"The Boltzmann is absolutely magnificent. I'm almost finished with it. 
He's a masterful writer. I am firmly convinced of the correctness of the 
principles of the theory, i.e., I am convinced that in the case of gases, we 
are really dealing with discrete mass points of definite finite size which 
move according to certain conditions. Boltzmann quite correctly em- 
phasizes that the hypothetical forces between molecules are not essential 
components of the theory, as the whole energy is essentially kinetic in 
character. This is a step forward in the dynamic explanation of physical 
phenomena." (CPAE, Vol. 1, Doc. 75; translation from Renn and Schul- 
mann, 1992, 32) 

For further details on this incubation period, we refer to (CPAE, Vol. 2, editorial note, 
p. 41). 

The first of Einstein's three papers on the foundations of statistical mechanics was 
submitted to the Annalen in June 1902. One can only be astonished about the self- 
assurance with which the 23-year-old approaches the fundamental problems. His aim 
is clearly described in the opening section: 

2 The 'Wien regime' corresponds to high frequency and/or low temperature, such that $h\nu \gg kT$, where
$h$ and $k$ are Planck's and Boltzmann's constants respectively.

"Great as the achievements of the kinetic theory of heat have been in the
domain of gas theory, the science of mechanics has not yet been able to 
produce an adequate foundation for the general theory of heat, for one 
has not yet succeeded in deriving the laws of thermal equilibrium and 
the second law of thermodynamics using only the equations of mechanics 
and the probability calculus, though Maxwell's and Boltzmann's theories 
came close to this goal. The purpose of the following considerations is 
to close this gap. At the same time, they will yield an extension of the 
second law that is of importance for the application of thermodynamics. 
They will also yield the mathematical expression for entropy from the 
standpoint of mechanics." (CPAE, Vol. 2, Doc. 3, p. 57) 3 

This is not the place to describe the detailed content of the three papers (for this we 
refer again to the editorial note in CPAE, Vol. 2 mentioned above). The third one 
begins with a brief polished summary of the two preceding ones, including several 
improvements. Then Einstein proceeds to a discussion of the "general significance of 
the constant k", by deriving the energy-fluctuation formula in the canonical ensemble. 
He comments: 

"Thus the absolute constant k determines the thermal stability of the sys- 
tem. The relationship just found is interesting because it no longer con- 
tains any quantity reminiscent of the assumption on which the theory is 
based." (CPAE, Vol. 2, Doc. 5, p. 105) 

In the final section of the paper Einstein applies his fluctuation formula to black-body 
radiation, a theme that would soon lead him to his light-quantum hypothesis. 



2.3 Applications of the classical theory 

In his dissertation A new determination of molecular dimensions, the second of the 
five papers of 1905, Einstein derived a new formula for the diffusion constant D of 
suspended microscopic particles. 4 This formula is obtained on the basis of thermal 
and dynamical equilibrium conditions, making use of van 't Hoff's law for the osmotic
pressure and Stokes' law for the mobility of a particle. The result, obtained almost
simultaneously by Sutherland, reads

$$D = \frac{kT}{6\pi\eta a}, \tag{1}$$

where $\eta$ is the viscosity of the fluid and $a$ the radius of the particles (assumed to be
spherical).
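
To give a feeling for the orders of magnitude, the relation $D = kT/(6\pi\eta a)$ can be evaluated numerically. The following Python sketch does this for illustrative inputs that are assumptions of this example and not taken from the text (a sphere of radius 0.5 micrometres in water near room temperature); the function name is likewise only illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def stokes_einstein_diffusion(temperature_k, viscosity_pa_s, radius_m):
    """Return D = kT / (6 pi eta a) in m^2/s for a sphere of radius a."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


# Assumed example values (not from the text): water near room temperature.
temperature = 293.0   # K
viscosity = 1.0e-3    # Pa s, approximate viscosity of water
radius = 0.5e-6       # m, a particle visible under a microscope

d_small = stokes_einstein_diffusion(temperature, viscosity, radius)
d_large = stokes_einstein_diffusion(temperature, viscosity, 2.0 * radius)

print(f"D for a = 0.5 micrometre: {d_small:.3e} m^2/s")
print(f"Doubling the radius changes D by a factor {d_large / d_small:.2f}")
```

The inverse dependence on the radius is what makes the formula usable as a molecular yardstick: a measured $D$, together with the viscosity and temperature, fixes either $a$ or, in Einstein's original form with $k = R/N_A$, the Avogadro number.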

3 We refer to The Collected Papers of Albert Einstein (CPAE) for all papers that have meanwhile ap- 
peared in this edition. Translations are taken from the companion volumes to the documentary editions. 

4 The main body of the paper is devoted to the derivation of a relation between the coefficients of 
viscosity of a liquid with and without suspended particles. Einstein applied this relation, together with 
the diffusion formula, to the case of sugar being dissolved in water. Using empirical data he got (after 
eliminating a calculational error) an excellent value of the Avogadro number and an estimate of the 
size of sugar molecules. For its wide range of applications Einstein's dissertation was by far the most 
cited of all of his papers around the time when the Einstein biography by Pais (1982) appeared. It probably
still is. For a recent detailed discussion of Einstein's dissertation, see Straumann (2005).

Brownian motion



This formula soon came to play an important role in Einstein's work on Brownian 
motion. In this celebrated paper he first gives a statistical mechanical derivation of 
the osmotic pressure, and then repeats his earlier derivation of (1). In the short novel
part of the paper he considers the diffusion alternatively as the result of a highly ir- 
regular random motion, caused by the bombardment of an enormously large number 
of molecules. On the basis of some idealizing assumptions, he shows that the random 
walks of the suspended particles can be described by a Gaussian process, "which was 
to be expected" (CPAE, Vol. 2, Doc. 16, p. 234). Moreover, the width of the proba- 
bility distribution for the position of a particle is determined by the diffusion constant. 
Therefore, the one-dimensional variance of the position is given by the famous formula

$$\langle x^2 \rangle = 2Dt. \tag{2}$$

All this is so well-known that no further explanations are necessary. It may, how- 
ever, be appropriate to recall the following sentences of the introductory part of Ein- 
stein's paper, which clearly express what he considered to be important. 

"If it is really possible to observe the motion to be discussed here, along 
with the laws it is expected to obey, then classical thermodynamics can 
no longer be viewed as strictly valid even for microscopically distinguish- 
able spaces, and an exact determination of the real size of atoms becomes 
possible. Conversely, if the prediction of this motion were to be proven 
wrong, this fact would provide a weighty argument against the molecular- 
kinetic conception of heat." (CPAE, Vol. 2, Doc. 16, p. 224) 
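
The linear growth of the variance with time, $\langle x^2 \rangle = 2Dt$, can be illustrated with a toy model. The Python sketch below is not Einstein's derivation: it assumes the simplest possible idealization, a particle that receives independent kicks of fixed length $\ell$ to the left or right at time intervals $\tau$, for which $D = \ell^2/(2\tau)$. All names and parameter values are choices made for this example.

```python
import random


def mean_square_displacement(n_particles, n_steps, step_length, seed=0):
    """Average x^2 over many independent +/- step_length random walks."""
    rng = random.Random(seed)
    total_sq = 0.0
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            # Each kick is left or right with equal probability.
            x += step_length if rng.random() < 0.5 else -step_length
        total_sq += x * x
    return total_sq / n_particles


step_length = 1.0   # arbitrary length unit (assumed)
time_step = 1.0     # arbitrary time unit (assumed)
n_steps = 100

diffusion = step_length**2 / (2.0 * time_step)
measured = mean_square_displacement(4000, n_steps, step_length)
predicted = 2.0 * diffusion * n_steps * time_step

print(f"measured <x^2> = {measured:.1f}, predicted 2Dt = {predicted:.1f}")
```

The agreement is statistical: with a few thousand walkers the sample average scatters around $2Dt$ at the level of a few percent.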

Critical opalescence 

A letter of Einstein to his collaborator Jacob Laub from August 27, 1910 (CPAE, 
Vol. 5, Doc. 224) shows his enthusiasm about his work on critical opalescence, yet 
another application of the theory of statistical fluctuations. This was Einstein's last 
contribution to classical statistical mechanics, and the corresponding measurements 
were soon carried out. 

Since about 1874 it had been known that the scattering and attenuation of light passing
through a gas become very large near the critical point. 5 In 1908 Marian von Smoluchowski
pointed out that this phenomenon is the result of density fluctuations of the
medium, but he did not derive a quantitative formula for the scattering or extinction 
coefficient. Einstein set out to close this gap (CPAE, Vol. 3, Doc. 9). 

Before he does, Einstein gives a lengthy introduction to the theory of statistical 
fluctuations based on Boltzmann's principle. He then applies the general theory to 
density fluctuations of fluids and mixtures of fluids. This opening section is a major 
and influential contribution to statistical thermodynamics. 

5 The point at which the partial derivative of the pressure with respect to the volume at constant temperature
vanishes, i.e., $(\partial p/\partial V)_T = 0$.

In the fourth section Einstein begins with the electrodynamic part of the problem
and derives the well-known formula for the scattering coefficient, which has long become
standard text-book material. If the refractive index $n$ is close to 1, this coefficient
reduces to

$$\alpha = \frac{1}{6\pi}\left(\frac{\omega}{c}\right)^4 kT\, \frac{(n^2-1)^2}{-V\,(\partial p/\partial V)_T}, \tag{3}$$

where $\omega$ is the angular frequency of the light. With this formula Einstein had found a
quantitative relationship between Rayleigh scattering and critical opalescence. 

At the critical point this expression diverges, because the correlation length for the 
density fluctuations diverges. As was first pointed out by Ornstein and Zernike, Einstein's
implicit assumption of statistical independence in separated volume elements is
then no longer valid. In this sense, Einstein's work on critical opalescence became the 
starting point of several research directions of the twentieth century. 

2.4 Post-Einstein developments 

Einstein did not have a dynamical theory of Brownian motion; he determined the na- 
ture of the motion on the basis of some assumptions. Another derivation was later 
given by Langevin, who separated the force on a suspended particle into ordered and 
disordered parts. Through this work he became the founder of the theory of stochastic 
differential equations. His approach was the starting point of the work of Ornstein 
and Uhlenbeck, which we shall briefly discuss below. Before doing this, however, we 
want to point out that Einstein's heuristic considerations, which have been criticized 
by many people (including Einstein himself), are tantamount to assumption (iii) of the 
following theorem. 

Theorem. Let $X_t$ ($0 \le t < \infty$) be a stochastic process satisfying the properties:

(i) Independence: Each increment $X_{t+\Delta t} - X_t$ is independent of $\{X_\tau, \tau \le t\}$.

(ii) Stationarity: The distribution of $X_{t+\Delta t} - X_t$ does not depend on $t$.

(iii) Continuity: If $P$ denotes the probability measure belonging to the stochastic
process, then

$$\lim_{\Delta t \downarrow 0} \frac{P(\{|X_{t+\Delta t} - X_t| \ge \delta\})}{\Delta t} = 0 \quad \text{for all } \delta > 0. \tag{4}$$

(iv)

$$X_{t=0} = 0. \tag{5}$$

Then $X_t$ has a normal distribution with $\langle X_t \rangle = 0$ and $\langle X_t^2 \rangle = \sigma^2 t$, where $\sigma$ is a
numerical constant.

For a proof, see Ch. 12 in (Breiman, 1968); see also Theorem 5.5 in Nelson (1967).

Einstein's theory of Brownian motion is highly idealized, since for example the 
velocity of a particle is not defined. Langevin's approach, perfected by Ornstein and 
Uhlenbeck (Uhlenbeck, 1930), is closer to Newtonian particle mechanics and is thus 
truly dynamical. In practice, for 'ordinary' Brownian motion, the predictions of the 
two theories are numerically indistinguishable. 

In the Ornstein-Uhlenbeck theory the velocity process $V_t$ is described in terms of
the stochastic differential equation (Langevin equation)

$$\dot{V}_t = -\alpha V_t + \sigma \xi_t, \tag{6}$$

where $\xi_t$ denotes 'white noise'. (The exact meaning of this equation is described in
every book on stochastic differential equations.)

Let us state a few important results that can be derived from the basic equation (6).

a) The distribution of $V_t$ converges for large $t$ to a Gaussian distribution with mean
zero and variance $\sigma^2/2\alpha$. Because of the equipartition theorem of statistical
mechanics it is, therefore, natural to set $\tfrac{1}{2}m(\sigma^2/2\alpha) = \tfrac{1}{2}kT$ (where $m$ is the
mass of the particle). The dissipation $\alpha$ thus induces a fluctuation

$$\sigma^2 = \frac{2\alpha}{m}\,kT. \tag{7}$$

b) The distributions of the positions $X_t$ converge for large $t$ to those of the Gaussian
process

$$\tilde{B}_t = X_0 + \sqrt{2D}\,B_t, \tag{8}$$

where $B_t$ is the Brownian (Wiener) process with variance 1, $X_0$ the initial position
of the particle, and

$$D = \frac{\sigma^2}{2\alpha^2} = \frac{kT}{m\alpha}. \tag{9}$$

The distribution function of $X_t$ is thus

$$p_t(x) = \frac{1}{\sqrt{4\pi Dt}}\, e^{-x^2/4Dt}, \tag{10}$$

and hence satisfies the diffusion equation

$$\partial_t p_t - D\,\partial_x^2 p_t = 0. \tag{11}$$

Therefore, $D$ is the diffusion constant. According to equation (9) it is given by
the Einstein value (1), if we also use Stokes' law for $\alpha$.
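
The two results a) and b) can be checked numerically. The sketch below integrates the Langevin equation with the Euler-Maruyama scheme, which is a choice made for this illustration (the text does not prescribe any numerical method), writing the equation as $dV = -\alpha V\,dt + \sigma\,dW$ and $dX = V\,dt$. The parameter values $\alpha = \sigma = 1$, the step size and the sample size are arbitrary assumptions.

```python
import math
import random


def simulate_ou(alpha, sigma, dt, n_steps, n_paths, seed=0):
    """Euler-Maruyama for dV = -alpha V dt + sigma dW, dX = V dt, from V = X = 0.

    Returns the sample second moments of V and X at the final time; both
    processes have mean zero, so these estimate the variances.
    """
    rng = random.Random(seed)
    noise_scale = sigma * math.sqrt(dt)
    sum_v2 = 0.0
    sum_x2 = 0.0
    for _ in range(n_paths):
        v = 0.0
        x = 0.0
        for _ in range(n_steps):
            x += v * dt
            v += -alpha * v * dt + noise_scale * rng.gauss(0.0, 1.0)
        sum_v2 += v * v
        sum_x2 += x * x
    return sum_v2 / n_paths, sum_x2 / n_paths


alpha, sigma = 1.0, 1.0          # assumed illustrative values
dt, n_steps = 0.02, 500
total_time = dt * n_steps        # 10 relaxation times 1/alpha

var_v, var_x = simulate_ou(alpha, sigma, dt, n_steps, n_paths=1000)
diffusion = sigma**2 / (2.0 * alpha**2)

print(f"velocity variance {var_v:.3f} vs sigma^2/(2 alpha) = {sigma**2 / (2 * alpha):.3f}")
print(f"position variance {var_x:.2f} vs 2 D t = {2 * diffusion * total_time:.2f}")
```

The velocity variance settles at $\sigma^2/2\alpha$ after a few relaxation times $1/\alpha$. The position variance approaches $2Dt$ only in the sense that the ratio tends to one for $t \gg 1/\alpha$; at finite times it lags behind by a constant offset of order $\sigma^2/\alpha^3$, which is why the printed value is somewhat below $2Dt$.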

The theory of stochastic differential equations has expanded into a huge field of 
stochastic analysis, with rich applications in physics, engineering, and mathematical 
finance. In quantum physics (generalized) stochastic processes have become very im- 
portant through Feynman-Kac path integral representations. We briefly recall a simple 
example of such a formula. 

Consider on $L^2(\mathbb{R}^n)$ the Schrödinger operator

$$H = -\tfrac{1}{2}\Delta + V. \tag{12}$$

Under certain conditions for the potential $V$, the operator is self-adjoint and the following
Feynman-Kac formula holds for each $t > 0$ and $\psi \in L^2$:

$$\left(e^{-tH}\psi\right)(x) = \left\langle \exp\left(-\int_0^t V(x+B_s)\,ds\right)\psi(x+B_t)\right\rangle, \tag{13}$$

almost everywhere in x. The expectation value on the right-hand side is taken with the 
probability measure belonging to the Brownian process B t . Such representations have 
many applications (see, e.g., Simon, 1979). 
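
As a numerical illustration of such a representation, the following Python sketch estimates the path expectation by Monte Carlo in a case where the answer is known in closed form. The test case is an assumption of this example and not taken from the text: the one-dimensional harmonic oscillator with $V(x) = x^2/2$ and $\psi(x) = e^{-x^2/2}$. With the convention $H = -\tfrac{1}{2}\,d^2/dx^2 + V$, which matches a Brownian process of unit variance per unit time, one has $H\psi = \tfrac{1}{2}\psi$ and hence $e^{-tH}\psi = e^{-t/2}\psi$. The function name, the trapezoidal time discretization and the sample sizes are illustrative choices.

```python
import math
import random


def feynman_kac_estimate(x, t, potential, psi, n_paths=4000, n_steps=50, seed=1):
    """Monte Carlo estimate of < exp(-int_0^t V(x + B_s) ds) psi(x + B_t) >."""
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        b = 0.0                      # Brownian path starts at B_0 = 0
        v_prev = potential(x + b)
        integral = 0.0
        for _ in range(n_steps):
            b += sqrt_dt * rng.gauss(0.0, 1.0)
            v_next = potential(x + b)
            integral += 0.5 * (v_prev + v_next) * dt   # trapezoidal rule
            v_prev = v_next
        total += math.exp(-integral) * psi(x + b)
    return total / n_paths


def harmonic_potential(y):
    return 0.5 * y * y


def ground_state(y):
    return math.exp(-0.5 * y * y)


t = 1.0
x0 = 0.0
estimate = feynman_kac_estimate(x0, t, harmonic_potential, ground_state)
exact = math.exp(-0.5 * t) * ground_state(x0)

print(f"Monte Carlo: {estimate:.4f}, exact e^(-t/2) psi(x): {exact:.4f}")
```

The estimate carries a statistical error that shrinks like the inverse square root of the number of paths, plus a small bias from the time discretization.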
In modern quantum field theory, path (functional) integral representations play a
crucial role. For gauge theories they are indispensable. A general remarkable fact, first 
pointed out by Feynman, is that the Euclidean formulation of quantum field theory in 
terms of functional integrals establishes a close connection with classical (!) statistical 
mechanics (models of magnetism). All this has by now become standard text-book 
material (see, e.g., Roepstorff, 1996).

3 Einstein's contributions to quantum theory 
3.1 Einstein's first paper from 1905 

We begin by briefly reviewing the line of thought of the March paper (CPAE, Vol. 2, 
Doc. 14) about which Res Jost said in 1979: "Without this paper the development of 
physics in our century is unthinkable" (Jost, 1995, p. 79). In the first section Einstein 
emphasizes that classical physics inevitably leads to a nonsensical energy distribution 
for black-body radiation, but that the spectral distribution, $\rho(T, \nu)$, must approximately
be correct for large wavelengths and radiation densities (classical regime). 6 Applying
the equipartition theorem for a system of resonators (harmonic oscillators) in thermal
equilibrium, he independently found what is now known as the Rayleigh-Jeans law:
$\rho(T, \nu) = (8\pi\nu^2/c^3)\,kT$. Einstein stresses that this law "not only fails to agree with
experience (...), but is out of question" (CPAE, Vol. 2, Doc. 14, p. 154) because it im- 
plies a diverging total energy density (ultraviolet catastrophe). In the second section 
he then states that the Planck formula, "which has been sufficient to account for all 
observations made so far" (ibid., p. 154) agrees with the classically derived formula in 
the mentioned limiting domain for the following value of Avogadro's number 

$$N_A = 6.17 \times 10^{23}. \tag{14}$$

This relation was already found by Planck, albeit not via a correspondence argument. 
Planck relied on the strict validity of his formula and the assumptions used in its deriva- 
tion. Einstein's correspondence argument now showed "that Planck's determination of 
the elementary quanta is to some extent independent of his theory of black-body radi- 
ation" (ibid., p. 155). Indeed, Einstein understood from first principles exactly what he 
did. A similar correspondence argument was used by him more than ten years later in 
his famous derivation of Planck's formula (more about this later). Einstein concludes 
these considerations with the following words: 

"The greater the energy density and the wavelength of the radiation, the 
more useful the theoretical principles we have been using prove to be; 
however, these principles fail completely in the case of small wavelengths 
and small radiation densities." (CPAE, Vol. 2, Doc. 14, p. 155) 
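
The two limiting domains can be made concrete numerically. The sketch below compares the Planck formula, which is not written out in the text and is taken here in its standard form $\rho = (8\pi\nu^2/c^3)\,h\nu/(e^{h\nu/kT}-1)$, with the Rayleigh-Jeans law and with Wien's radiation formula for several values of $x = h\nu/kT$. The temperature and the sampled values of $x$ are arbitrary assumptions of this example.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s


def prefactor(nu):
    """Common factor 8 pi nu^2 / c^3 of all three formulas."""
    return 8.0 * math.pi * nu**2 / C**3


def planck(nu, temperature):
    x = H * nu / (K_B * temperature)
    return prefactor(nu) * H * nu / math.expm1(x)


def rayleigh_jeans(nu, temperature):
    return prefactor(nu) * K_B * temperature


def wien(nu, temperature):
    x = H * nu / (K_B * temperature)
    return prefactor(nu) * H * nu * math.exp(-x)


temperature = 300.0  # K, assumed for illustration
results = []
for x in (0.01, 1.0, 20.0):
    nu = x * K_B * temperature / H
    rho = planck(nu, temperature)
    results.append((x, rho / rayleigh_jeans(nu, temperature), rho / wien(nu, temperature)))

for x, ratio_rj, ratio_wien in results:
    print(f"h nu / kT = {x:5.2f}: Planck/RJ = {ratio_rj:.3e}, Planck/Wien = {ratio_wien:.3e}")
```

For small $x$ the Planck density agrees with the classical Rayleigh-Jeans value, which is the content of the correspondence argument; for large $x$ it agrees with Wien's formula, the regime Einstein analyzes next.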

Einstein now begins to analyze what can be learned about the structure of radiation 
from the empirical behavior in the Wien regime, i.e., from Wien's radiation formula 
for the spectral energy-density 

$$\rho(T,\nu) = \frac{8\pi\nu^2}{c^3}\, h\nu\, e^{-h\nu/kT}. \tag{15}$$

6 This is, to our knowledge, the first proposal of a 'correspondence argument' , which is of great heuristic 
power, as we will see.

Let $E_V(T,\nu)$ be the energy of radiation contained in the volume $V$ and within the
frequency interval $[\nu, \nu + \Delta\nu]$ ($\Delta\nu$ small), i.e.,

$$E_V(T,\nu) = \rho(T,\nu)\, V \Delta\nu, \tag{16}$$

and, correspondingly, $S_V(T,\nu) = \sigma(T,\nu)\, V \Delta\nu$ for the entropy. Thermodynamics
implies

$$\frac{\partial\sigma}{\partial\rho} = \frac{1}{T}. \tag{17}$$

Solving (15) for $1/T$ and inserting this into (17) gives

$$\frac{\partial\sigma}{\partial\rho} = -\frac{k}{h\nu}\, \ln\left[\frac{\rho}{8\pi h\nu^3/c^3}\right]. \tag{18}$$

Integration yields

$$S_V = -k\,\frac{E_V}{h\nu}\left\{\ln\left[\frac{E_V}{V\Delta\nu\; 8\pi h\nu^3/c^3}\right] - 1\right\}. \tag{19}$$



In his first paper on this subject, Einstein focused his attention on the volume depen- 
dence of the entropy of the radiation as given by this expression. Fixing the amount of 
energy, $E = E_V$, one obtains

$$S_V - S_{V_0} = k\,\frac{E}{h\nu}\, \ln\left(\frac{V}{V_0}\right) = k \ln\left(\frac{V}{V_0}\right)^{E/h\nu}. \tag{20}$$



So far only thermodynamics has been used. Now Einstein introduces what he 
calls Boltzmann's principle, which was already of central importance in his papers on 
statistical mechanics. According to Boltzmann, the entropy S of a system is connected 
with the number of possibilities W, by which a macroscopic state can microscopically 
be realized, through the relation 



$$S = k \ln W. \tag{21}$$



In a separate section Einstein recalls this fundamental relation between entropy and 
"statistical probability" (Einstein's terminology) before applying it to an ideal gas of 
$N$ particles in volumes $V$ and $V_0$, respectively. For the relative probability of the two
situations one has

$$W = \left(\frac{V}{V_0}\right)^{N}, \tag{22}$$

and hence for the entropies

$$S(V,T) - S(V_0,T) = kN \ln\left(\frac{V}{V_0}\right). \tag{23}$$

For the relative entropies (20) of the radiation field, Boltzmann's principle (21) now
gives

$$W = \left(\frac{V}{V_0}\right)^{E/h\nu}. \tag{24}$$

From the striking similarity between (22) and (24) Einstein concludes:

"Monochromatic radiation of low density (within the range of Wien's radiation
formula) behaves thermodynamically as if it consisted of mutually
independent energy quanta of magnitude $R\beta\nu/N$" (CPAE, Vol. 2, Doc. 14,
p. 161)

Here $R\beta/N$ corresponds to $h$. So far no revolutionary statement has been made. The
famous sentences just quoted express the result of a statistical-mechanical analysis. 

Light quantum hypothesis 

Einstein's bold step consists in a statement about the quantum properties of the free 
electromagnetic field that was not accepted for a long time by anybody else. He formulates
his heuristic principle as follows (where we replaced his $R\beta/N$ by $h$):

"If, with regard to the dependence of its entropy on volume, a monochromatic
radiation (of sufficiently low density) behaves like a discontinuous
medium consisting of energy quanta of magnitude $h\nu$, then it seems
reasonable to investigate whether the laws of generation and conversion 
of light are so constituted as if light consisted of such energy quanta." 
(CPAE, Vol. 2, Doc. 14, p. 143-144) 

In the final two sections, Einstein applies this hypothesis first to an explanation 
of Stokes' rule for photoluminescence and then turns to the photoelectric effect. One 
should be aware that in those days only some qualitative properties of this phenomenon 
were known. Therefore, Einstein's well-known linear relation between the maximum
kinetic energy of the photoelectrons ($E_{\max}$) and the frequency of the incident radiation,

$$E_{\max} = h\nu - P, \tag{25}$$

was a true prediction. Here $P$ is the work function of the metal emitting the electrons,
which depends on the material in question but not on the frequency of the incident
light. It took almost ten years until this was experimentally confirmed by Millikan,
who then used it to give a first precision measurement of $h$ (the slope of the straight line
given by (25) in the $\nu$-$E_{\max}$ plane) at the 0.5 percent level (Millikan 1916). Strangely,
though understandably, not even Millikan, 7 who spent 10 years on the brilliant experimental
verification of its consequence (25), could believe in the fundamental correctness
of Einstein's hypothesis. In his comprehensive paper on the determination of $h$,
Millikan first commented on the light-quantum hypothesis: 

"This hypothesis may well be called reckless, first because an electro- 
magnetic disturbance which remains localized in space seems a violation 
of the very conception of an electromagnetic disturbance, and second be- 
cause it flies in the face of the thoroughly established facts of interfer- 
ence." (Millikan, 1916, p. 355) 

7 Others who strongly opposed Einstein's idea, or at least openly stated disbelief, included Planck (compare
footnote 8), Sommerfeld, von Laue, Lorentz and Bohr. As late as 1922, in his Nobel Lecture,
Bohr (1922, p. 14) stated that "In spite of its heuristic value, however, the hypothesis of light quanta,
which is quite irreconcilable with so-called interference phenomena, is not able to throw light on the
nature of radiation." Bohr's critical attitude culminated in his famous joint paper of Bohr, Kramers, and
Slater (1924); see, e.g., Section 11d in Pais (1991) for more background information on this fascinating
episode.

And after reporting on his successful experimental verification of Einstein's equation
(25) and the associated determination of $h$, Millikan concludes:

"Despite the apparently complete success of the Einstein equation, the 
physical theory of which it was designed to be the symbolic expression is 
found so untenable that Einstein himself, I believe, no longer holds to it." 
(Millikan, 1916, p. 384) 
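
The logic of the measurement, reading $h$ off as the slope of the straight line $E_{\max} = h\nu - P$, can be sketched in a few lines of Python. The data below are synthetic and noise-free, generated from the relation itself; they are not Millikan's measurements, and the work function of 2.3 eV is a hypothetical value chosen only so that the sampled frequencies lie above threshold.

```python
H = 6.62607015e-34     # Planck constant, J s
EV = 1.602176634e-19   # joules per electron volt


def max_kinetic_energy(nu, work_function):
    """Einstein's relation E_max = h nu - P (valid above threshold)."""
    return H * nu - work_function


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


work_function = 2.3 * EV                                   # hypothetical metal
frequencies = [6.0e14 + 1.0e14 * i for i in range(6)]      # Hz, above threshold
energies = [max_kinetic_energy(nu, work_function) for nu in frequencies]

slope, intercept = fit_line(frequencies, energies)

print(f"fitted slope     = {slope:.6e} J s (input h = {H:.6e})")
print(f"fitted intercept = {intercept / EV:.3f} eV (input -P = {-work_function / EV:.3f} eV)")
```

The slope is independent of the metal, while the intercept is not; this is why the slope gives a determination of $h$ that does not require knowing $P$.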

It should be stressed that Einstein's bold light quantum hypothesis was very far from 
Planck's conception. Planck neither envisaged a quantization of the free radiation field, 
nor did he, as is often stated, quantize the energy of a material oscillator per se. What 
he was actually doing in his decisive calculation of the entropy of a harmonic oscillator 
was to assume that the total energy of a large number of oscillators is made up of finite 
energy elements of equal magnitude $h\nu$. He did not propose that the energies of single
material oscillators are physically quantized. 8 Rather, the energy elements $h\nu$ were
introduced as a formal counting device that could at the end of the calculation not be
set to zero, for, otherwise, the entropy would diverge. It was Einstein in 1906 who
interpreted Planck's result as follows (again writing $h$ for $R\beta/N$):

"Hence, we must view the following proposition as the basis underlying 
Planck's theory of radiation: The energy of an elementary resonator can 
only assume values that are integral multiples of $h\nu$; by emission and absorption,
the energy of a resonator changes by jumps of integral multiples
of $h\nu$." (CPAE, Vol. 2, Doc. 34, p. 353)



3.2 Energy and momentum fluctuations of the radiation field 

In his paper "On the present status of the radiation problem" of 1909 (CPAE, Vol. 2, 
Doc. 56), Einstein returned to the considerations discussed above, but extended his 
statistical analysis to the entire Planck distribution. First, he considers the energy 
fluctuations, and re-derives the general fluctuation formula he had already found in the 
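third of his statistical-mechanics articles. This implies for the variance of $E_V$ in (16):

$$\left\langle (E_V - \langle E_V\rangle)^2 \right\rangle = kT^2\, \frac{\partial \langle E_V\rangle}{\partial T} = kT^2\, V\Delta\nu\, \frac{\partial\rho}{\partial T}. \tag{26}$$

For the Planck distribution this gives

$$\left\langle (E_V - \langle E_V\rangle)^2 \right\rangle = \left(h\nu\rho + \frac{c^3}{8\pi\nu^2}\,\rho^2\right) V\Delta\nu. \tag{27}$$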

8 In 1911 Planck even formulated a 'new radiation hypothesis', in which quantization only applies to
the process of light emission but not to that of light absorption (Planck 1911). Planck's explicitly
stated motivation for this was to avoid an effective quantization of oscillator energies as a result of
quantization of all interaction energies. It is amusing to note that this new hypothesis led Planck to a
modification of his radiation law, which consisted in the addition of the temperature-independent term
$h\nu/2$ to the energy of each oscillator, thus corresponding to the oscillator's energy at zero temperature.
This seems to be the first appearance of what soon became known as 'zero-point energy'.

Einstein shows that the second term within the parentheses of this most remarkable
formula, which dominates in the Rayleigh-Jeans regime, can be understood with the
help of the classical wave theory as due to interference between partial waves. The
first term, dominating in the Wien regime, is thus in obvious contradiction to classical
electrodynamics. It can, however, be interpreted by analogy to the fluctuations of
the number of molecules in ideal gases, and thus represents a particle aspect of the
radiation in the quantum domain.
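
That the general fluctuation formula $kT^2\,\partial\rho/\partial T$, evaluated for the Planck distribution, splits into exactly these two terms can be verified numerically. The Python sketch below works per unit $V\Delta\nu$, takes the Planck formula in its standard form, and approximates the temperature derivative by a central difference; the temperature, the step size and the sampled values of $x = h\nu/kT$ are assumptions of this example.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s


def planck(nu, temperature):
    x = H * nu / (K_B * temperature)
    return (8.0 * math.pi * nu**2 / C**3) * H * nu / math.expm1(x)


def thermodynamic_variance(nu, temperature, rel_step=1e-4):
    """k T^2 d(rho)/dT, with the derivative taken by central difference."""
    d_t = rel_step * temperature
    d_rho = planck(nu, temperature + d_t) - planck(nu, temperature - d_t)
    return K_B * temperature**2 * d_rho / (2.0 * d_t)


def particle_and_wave_terms(nu, temperature):
    """The two terms h nu rho and c^3 rho^2 / (8 pi nu^2)."""
    rho = planck(nu, temperature)
    particle = H * nu * rho
    wave = C**3 / (8.0 * math.pi * nu**2) * rho**2
    return particle, wave


temperature = 300.0  # K, assumed for illustration
checks = []
for x in (0.1, 1.0, 10.0):
    nu = x * K_B * temperature / H
    particle, wave = particle_and_wave_terms(nu, temperature)
    lhs = thermodynamic_variance(nu, temperature)
    checks.append((x, lhs / (particle + wave), particle / wave))

for x, agreement, ratio in checks:
    print(f"h nu / kT = {x:4.1f}: lhs/rhs = {agreement:.6f}, particle/wave = {ratio:.3e}")
```

The ratio of the particle term to the wave term equals $e^{h\nu/kT} - 1$, so the particle term dominates in the Wien regime and the wave term in the Rayleigh-Jeans regime, with a crossover where $h\nu$ is comparable to $kT$.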

Einstein confirms this particle-wave duality, at this time a genuine theoretical co- 
nundrum, by considering momentum fluctuations. For this he considers the Brownian 
motion of a mirror that perfectly reflects radiation in a small frequency interval, but 
transmits radiation of all other frequencies. About the final result he writes: 

"The close connection between this relation and the one derived in the 
last section for the energy fluctuation is immediately obvious, and ex- 
actly analogous considerations can be applied to it. Again, according to 
the current theory, the expression would be reduced to the second term 
(fluctuations due to interference). If the first term alone were present, the 
fluctuations of the radiation pressure could be completely explained by 
the assumption that the radiation consists of independently moving, not 
too extended complexes of energy $h\nu$." (CPAE, Vol. 2, Doc. 56, p. 547)

Einstein also discussed these issues in his famous Salzburg lecture (CPAE Vol. 2, 
Doc. 60) at the 81st Meeting of German Scientists and Physicians in 1909. Pauli (1949) 
once said that this report can be regarded as a turning point in the development of 
theoretical physics. In this lecture, Einstein treated the theory of relativity and quantum 
theory and pointed out important interconnections between his work on the quantum 
hypothesis, on relativity, on Brownian motion, and statistical mechanics. Already in 
the introductory section he says prophetically: 

"It is therefore my opinion that the next stage in the development of the- 
oretical physics will bring us a theory of light that can be understood as a 
kind of fusion of the wave and emission theories of light." (CPAE, Vol. 2, 
Doc. 60, p. 564-565) 

We now know that it took almost twenty years until this was achieved by Dirac in his 
quantum theory of radiation. 

Specific heat of solids 

In 1907 Einstein used his understanding of black-body radiation to develop a theory for 
the specific heat of solids (CPAE Vol. 2, Doc. 38). He starts by showing that Planck's 
radiation law can be derived within statistical mechanics by restricting the state sum of 
the oscillators to quantized energies, and obtains for the average energy of an oscillator 
the expression $h\nu/(e^{h\nu/kT} - 1)$. An interesting methodological aspect of his first paper
on this subject is that Einstein for the first time works with the canonical ensemble. He 
repeatedly came back to the subject, in particular at the Solvay Congress in 1911, 
when measurements by Nernst were available. Shortly afterwards, Born and von Kármán
and, independently, Debye developed the theory that has become standard.
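
The average oscillator energy quoted above already fixes the temperature dependence of the specific heat. As a minimal sketch, assume N atoms that each contribute three independent oscillators of one common frequency; differentiating $3N h\nu/(e^{h\nu/kT} - 1)$ with respect to T then gives the heat capacity computed below. The function name, the abbreviation $\theta = h\nu/k$ and the numerical value chosen for it are illustrative assumptions, not taken from Einstein's paper.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)

def einstein_molar_heat_capacity(temperature_k, einstein_temperature_k):
    """Molar C_V for three oscillators per atom, all with the same frequency.

    With x = h*nu/(k*T) = einstein_temperature_k / temperature_k the result is
    3R * x^2 * e^x / (e^x - 1)^2, written here as x^2 * e^(-x) / (1 - e^(-x))^2
    so that large x (low temperature) does not overflow.
    """
    x = einstein_temperature_k / temperature_k
    return 3.0 * R * x**2 * math.exp(-x) / math.expm1(-x) ** 2

if __name__ == "__main__":
    theta = 1000.0  # hypothetical value of h*nu/k in kelvin, for illustration only
    for t in (50.0, 300.0, 1000.0, 10000.0):
        c_v = einstein_molar_heat_capacity(t, theta)
        print(f"T = {t:8.1f} K   C_V / 3R = {c_v / (3.0 * R):.4f}")
```

The ratio C_V/3R tends to 1 at high temperature (the classical Dulong-Petit value) and is exponentially suppressed for T well below $\theta$, which is the qualitative behaviour the quantized oscillator energies were introduced to explain.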

3.3 Derivation of the Planck distribution

A peak in Einstein's endeavor to extract as much information as possible about the 
nature of radiation from the Planck distribution is his paper "On the Quantum Theory 
of Radiation" of 1916 (CPAE, Vol. 6, Doc. 38). In the first part he gives a derivation of 
Planck's formula which has become part of many textbooks on quantum theory. Ein- 
stein was very pleased by this derivation, about which he wrote on August 11, 1916 to 
Besso: "An amazingly simple derivation of Planck's formula, I should like to say the 
derivation" (CPAE, Vol. 8, Doc. 250). In this derivation he added the hitherto unknown 
process of induced emission, 9 to the familiar processes of spontaneous emission and 
induced absorption. For each pair of energy levels he described the statistical laws 
for these processes by three coefficients (the famous A- and B-coefficients) and
established two relations between these coefficients on the basis of his earlier
correspondence argument in the classical Rayleigh-Jeans limit and Wien's displacement law. In
addition, the latter implies that the energy difference $\varepsilon_n - \varepsilon_m$ between two internal
energy states of the atoms in equilibrium with thermal radiation has to satisfy Bohr's
frequency condition: $\varepsilon_n - \varepsilon_m = h\nu_{nm}$. In Dirac's 1927 radiation theory these results
follow — without any correspondence arguments — from first principles. 
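
For orientation, the equilibrium argument can be sketched in modern textbook notation. The symbols $N_n$, $N_m$ (level populations), $A_{nm}$, $B_{nm}$, $B_{mn}$ and the neglect of statistical weights are conventions assumed here for brevity; they are not a transcription of Einstein's paper.

```latex
% Levels \varepsilon_n > \varepsilon_m in equilibrium with radiation of spectral density \rho.
% Emission (spontaneous + induced) balances absorption, with Boltzmann-distributed populations:
\begin{align}
  N_n \bigl( A_{nm} + B_{nm}\,\rho \bigr) &= N_m\, B_{mn}\,\rho ,
  \qquad \frac{N_n}{N_m} = e^{-(\varepsilon_n - \varepsilon_m)/kT} , \\
  \rho &= \frac{A_{nm}}{B_{mn}\, e^{(\varepsilon_n - \varepsilon_m)/kT} - B_{nm}} .
\end{align}
% Requiring \rho \to \infty for T \to \infty gives B_{mn} = B_{nm}, hence
\begin{equation}
  \rho = \frac{A_{nm}/B_{nm}}{e^{(\varepsilon_n - \varepsilon_m)/kT} - 1} .
\end{equation}
% Without induced emission (B_{nm} = 0) one obtains Wien's form instead:
\begin{equation}
  \rho = \frac{A_{nm}}{B_{mn}}\, e^{-(\varepsilon_n - \varepsilon_m)/kT} .
\end{equation}
```

Wien's displacement law then forces $\varepsilon_n - \varepsilon_m$ to be proportional to the frequency and $A_{nm}/B_{nm}$ to be proportional to $\nu^3$, and comparison with the Rayleigh-Jeans law at high temperature fixes $A_{nm}/B_{nm} = 8\pi h\nu^3/c^3$.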

In the second part of his fundamental paper, Einstein discusses the exchange of 
momentum between atoms and radiation by making use of the theory of Brownian 
motion. Using a truly beautiful argument he shows that in every elementary process of 
radiation, and in particular in spontaneous emission, an amount $h\nu/c$ of momentum
is emitted in a random direction and that the atomic system suffers a corresponding 
recoil in the opposite direction. This recoil was first experimentally confirmed in 1933 
by showing that a long and narrow beam of excited sodium atoms widens up after 
spontaneous emissions have taken place (Frisch, 1933). Einstein's paper ends with the 
following remarkable statement concerning the role of "chance" in his description of 
the radiation processes by statistical laws, to which Pauli (1949) drew special attention: 

"The weakness of the theory lies, on the one hand, in the fact that it does 
not bring us any closer to a merger with the undulatory theory, and, on 
the other hand, in the fact that it leaves the time and direction of elemen- 
tary processes to 'chance'; in spite of this I harbor full confidence in the 
trustworthiness of the path entered upon." (CPAE, Vol. 6, Doc. 38, p. 396) 
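
To get a feeling for the size of the recoil discussed in this section, one can evaluate $h\nu/c = h/\lambda$ for a sodium atom. The wavelength (the sodium D line, roughly 589 nm) and the atomic mass (roughly 23 u) are standard values assumed here; they are not quoted in the text, and the function name is hypothetical.

```python
H_PLANCK = 6.62607015e-34          # Planck constant in J s
ATOMIC_MASS_UNIT = 1.66053907e-27  # kg

def recoil_velocity(wavelength_m, mass_kg):
    """Velocity change of an atom that emits or absorbs a single photon."""
    photon_momentum = H_PLANCK / wavelength_m  # equals h*nu/c
    return photon_momentum / mass_kg

if __name__ == "__main__":
    # Assumed inputs: sodium D line and the mass of a sodium atom.
    v = recoil_velocity(589e-9, 23.0 * ATOMIC_MASS_UNIT)
    print(f"recoil velocity per photon: {100.0 * v:.2f} cm/s")
```

The result is of the order of a few centimetres per second, tiny compared with thermal atomic velocities, which is why a long and narrow beam is needed to make the effect visible.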

3.4 Bose-Einstein statistics for degenerate material gases 

The last major contributions of Einstein to quantum theory were stimulated by 
de Broglie's suggestion that material particles also have a wave aspect, and by Bose's
derivation of Planck's formula, which only made use of the picture of light as particles, 
albeit particles satisfying a new statistics on account of their indistinguishability. Ein- 
stein (1924, 1925a, 1925b) applied Bose's statistics for photons to degenerate gases of 
identical massive particles. With this 'Bose-Einstein statistics', he obtained a new law, 
to become known as the Bose-Einstein distribution. As with radiation, Einstein con- 
sidered fluctuations in these gases and found both particle-like and wave-like aspects. 

9 Einstein's derivation shows that without assuming a non-zero probability for induced emission one 
would necessarily arrive at Wien's instead of Planck's radiation law. 
This time the wave property was the novel feature that Einstein recognized to
be necessary.

In the course of this work on quantum gases, Einstein discovered the condensation 
of such gases at low temperatures. (Although Bose made no contributions to this, one 
nowadays speaks of Bose-Einstein condensation.) Needless to say, this subject has
become enormously topical in recent years. 

In his papers on wave mechanics, Schrödinger acknowledged the influence of Einstein's
gas theory, which from today's perspective appears to be his last great constructive
contribution to physics proper. In the article in which Schrödinger establishes the
connection of matrix and wave mechanics, he remarks in a footnote: "My theory was
inspired by L. de Broglie and by brief but infinitely far-seeing remarks of A. Einstein
[1925a, p. 9 ff.]" (Schrödinger, 1926, p. 735).

It is well-known that Einstein considered the 'new' quantum mechanics to be un- 
satisfactory until the end of his life. In his autobiographical notes, for example, he 
writes: 

"I believe, however, that this theory offers no useful point of departure for 
future developments. This is the point at which my expectation departs 
most widely from that of contemporary physicists." (Einstein, 1979, p. 83) 

3.5 Light quanta after 1925 

In his contribution to one of the foundational papers on matrix mechanics (Born & 
Jordan 1925), Pascual Jordan made it clear that the quantum-interpretation of physical 
observables must apply to the electromagnetic field as well. He elaborated on this in 
the extended final section of the Dreimännerarbeit by Born, Heisenberg and Jordan.
In particular, Jordan derived Einstein's fluctuation formula (27) from a description
of the cavity radiation as an infinite set of uncoupled harmonic oscillators, quantized 
according to the rules of matrix mechanics. 10 With this and later investigations, partly 
in collaboration with other authors (Klein, Wigner, Pauli), Jordan is not only one of 
the creators of quantum mechanics, but also one of the founding fathers of quantum 
field theory. 11

After Jordan, Dirac was the first to address, in the fall of 1926, the quantum-theoretic
description of the electromagnetic field. In this seminal work he treated
for the first time the quantized electromagnetic field in interaction with atomic mat- 
ter described by non-relativistic wave mechanics. Treating the coupled system in first 
order perturbation theory, he obtained directly — without the use of correspondence 
arguments — Einstein's rules for emission and absorption of light. As Gregor Wentzel 
wrote in an article on the early history of quantum field theory in the memorial volume 
for Wolfgang Pauli, 

"Today, the novelty and boldness of Dirac's approach to the radiation 
problem may be hard to appreciate. During the preceding decade it had 
become a tradition to think of Bohr's correspondence principle as the 
supreme guide in such questions, and, indeed, the efforts to formulate this 

10 This was inspired by earlier work of Ehrenfest (1906) and Debye (1910).
11 For biographical notes we refer to Sec. 1.2 of (Schweber, 1994).
principle in a quantitative fashion had led to the essential ideas prepar-
ing the eventual discovery of matrix mechanics by Heisenberg. A new 
aspect of the problem appeared when it became possible, by quantum me- 
chanical perturbation theory to treat atomic transitions induced by given 
external wave fields, e.g., the photoelectric effect. The transitions so cal- 
culated could be interpreted as being caused by absorptive processes, but 
the "reaction on the field", namely the disappearance of a photon, was not 
described by the theory, nor was there any possibility, in this framework, 
of understanding the process of spontaneous emission. Here, the corre- 
spondence principle still seemed indispensable, a rather foreign element 
(a "magic wand" as Sommerfeld called it) in this otherwise very coherent 
theory. At this point, Dirac's explanation in terms of the q matrix came 
as a revelation. Known results were re-derived, but in a completely uni- 
fied way. The new theory stimulated further thinking about application of 
quantum mechanics to electromagnetic and other fields." (Wentzel, 1960, 
p. 49) 

In Dirac's theory the dual particle/wave aspects of radiation are described in a 
coherent, logically consistent manner. The shortcomings of the theory, however, were 
immediately pointed out by Ehrenfest and others. Since the interaction terms contain 
the vector potential at the position of the point-like electron, the theory would lead to 
infinities in higher-order perturbation theory. In particular, the self-energy of a free or 
bound electron turned out to be infinite. Because of these divergence difficulties, most
theorists working on quantum electrodynamics in those early
days had little faith in the theory. In Sec. 4.5 we shall take up this subject again and
sketch the further developments of relativistic quantum field theory.

3.6 Einstein and the interpretation of quantum mechanics 

The new generation of young physicists who participated in the tumultuous three-year 
period from January 1925 to January 1928 deplored Einstein's negative judgement of 
quantum mechanics. In the article on Einstein's contributions to quantum mechanics 
cited above, Pauli expressed the disappointment of his contemporaries: 

"The writer belongs to those physicists who believe that the new episte- 
mological situation underlying quantum mechanics is satisfactory, both 
from the standpoint of physics and from the broader knowledge in gen- 
eral. He regrets that Einstein seems to have a different opinion on this 
situation (...)." (Pauli, 1949, p. 149) 

When the Einstein-Podolsky-Rosen (EPR) paper (Einstein et al. 1935) appeared,
Pauli's immediate reaction in a letter to Heisenberg of June 15th was quite furious: 

"Einstein once again has expressed himself publicly on quantum mechan- 
ics, namely in the issue of Physical Review of May 15th (in cooperation 
with Podolsky and Rosen - not a good company, by the way). As is well 
known, this is a catastrophe each time when it happens." (Pauli, 1985-99, 
Vol.2, Doc. 412, p. 402) 

From our present vantage point this judgment is clearly too harsh, but it shows the
attitude of the 'younger generation' towards Einstein's concerns. In fact, Pauli un- 
derstood (even if he did not accept) Einstein's point much better than many others, 
as his intervention in the Born-Einstein debate on Quantum Mechanics shows (Born 
2005; Pauli to Born, March 31, 1954). Whatever one's attitude on this issue is, it is 
certainly true that the EPR argumentation has engendered an uninterrupted discussion 
up to this day. The most influential of John Bell's papers on the foundations of quan- 
tum mechanics bears the title "On the Einstein-Podolsky-Rosen paradox" (Bell 1964). 
In this publication Bell presents what has come to be called "Bell's Theorem", which 
(roughly) asserts that no hidden-variable theory that satisfies a certain locality condi- 
tion can produce all predictions of quantum mechanics. This signals the importance of 
EPR's paper in focusing on a pair of well-separated particles that have been properly 
prepared to ensure strict correlations between some of the observable quantities asso- 
ciated with them. Bell's analysis and later refinements (Bell, 1987) showed clearly 
that the behavior of entangled states is explicable only in the language of quantum 
mechanics. 
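
The content of Bell's theorem can be illustrated numerically with the CHSH form of the inequality, which is one of the later refinements and not the inequality of Bell's 1964 paper. The sketch below assumes two spin-1/2 particles in the singlet state, for which quantum mechanics gives the correlation $E(a,b) = -\cos(a-b)$ for analyser angles a and b in a common plane; the function names and the particular angles are choices made here for illustration.

```python
import itertools
import math

def chsh(e_ab, e_ab2, e_a2b, e_a2b2):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return e_ab - e_ab2 + e_a2b + e_a2b2

def max_local_deterministic_chsh():
    """Largest |S| when all four outcomes (+1 or -1) are fixed in advance."""
    best = 0
    for a, a2, b, b2 in itertools.product((-1, 1), repeat=4):
        best = max(best, abs(chsh(a * b, a * b2, a2 * b, a2 * b2)))
    return best

def singlet_correlation(angle_a, angle_b):
    """Quantum-mechanical spin correlation in the singlet state (assumed model)."""
    return -math.cos(angle_a - angle_b)

def quantum_chsh(a, a2, b, b2):
    return chsh(singlet_correlation(a, b), singlet_correlation(a, b2),
                singlet_correlation(a2, b), singlet_correlation(a2, b2))

if __name__ == "__main__":
    print("local bound on |S|:", max_local_deterministic_chsh())
    s = quantum_chsh(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
    print("quantum |S|       :", abs(s))
```

Since a local stochastic hidden-variable model is a mixture of such predetermined assignments, its value of |S| cannot exceed the deterministic maximum, whereas the singlet correlations reach $2\sqrt{2}$ for the angles used above.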

This point has also been the subject of the very interesting, but much less known 
work of Kochen & Specker (1967), with the title "The Problem of Hidden Variables 
in Quantum Mechanics". Loosely speaking, Kochen and Specker show that quantum 
mechanics cannot be embedded in a classical stochastic theory, provided two very de- 
sirable conditions are assumed to be satisfied. The first condition (KS1) is that the 
quantum-mechanical distributions are reproduced by the embedding of the quantum 
description into a classical stochastic theory. (The precise definition of this concept 
is given in the cited paper.) The authors first show that hidden variables in this sense 
can always be introduced if there are no other requirements. (This is not difficult to 
prove.) The second condition (KS2) states that a function u(A) of self-adjoint operators
A representing quantum-mechanical observables has to be represented in the
classical description by the very same function u of the image $f_A$ of A, where f is the
embedding that maps the operator A to the classical observable $f_A$ on 'phase space'.
Formally, (KS2) states that for all A

$$ f_{u(A)} = u(f_A) . \tag{28} $$

The main result of Kochen and Specker states that if the dimension of the Hilbert 
space of quantum mechanical states is larger than 2, an embedding satisfying (KS1) 
and (KS2) is 'in general' not possible. 

There are many highly relevant examples — even of low dimensions with only a 
finite number of states and observables — where this impossibility holds. 
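
One such finite example can be checked by brute force. The sketch below uses the Peres-Mermin square of nine two-qubit observables (Hilbert-space dimension 4), which is a later and simpler construction than the original Kochen-Specker argument and is swapped in here purely for illustration. It relies on the fact that, for commuting observables, (KS2) implies that the classical value of a product equals the product of the classical values. All names in the code are hypothetical.

```python
import itertools

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def kron(a, b):
    """Tensor product of two 2x2 matrices, returned as a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sign_of_identity(m):
    """Return s, given that m equals s times the 4x4 identity with s = +1 or -1."""
    s = m[0][0]
    assert s in (1, -1)
    assert all(m[i][j] == (s if i == j else 0) for i in range(4) for j in range(4))
    return int(s.real)

# Each row and each column consists of three mutually commuting observables.
SQUARE = [
    [kron(X, I2), kron(I2, X), kron(X, X)],
    [kron(I2, Z), kron(Z, I2), kron(Z, Z)],
    [kron(X, Z), kron(Z, X), kron(Y, Y)],
]

def product_sign(ops):
    return sign_of_identity(matmul(matmul(ops[0], ops[1]), ops[2]))

ROW_SIGNS = [product_sign(row) for row in SQUARE]
COL_SIGNS = [product_sign([SQUARE[r][c] for r in range(3)]) for c in range(3)]

def count_consistent_assignments():
    """Count assignments of +1/-1 to the nine observables that respect the
    operator product relation in every row and every column."""
    count = 0
    for values in itertools.product((-1, 1), repeat=9):
        v = [values[0:3], values[3:6], values[6:9]]
        rows_ok = all(v[r][0] * v[r][1] * v[r][2] == ROW_SIGNS[r] for r in range(3))
        cols_ok = all(v[0][c] * v[1][c] * v[2][c] == COL_SIGNS[c] for c in range(3))
        count += int(rows_ok and cols_ok)
    return count

if __name__ == "__main__":
    print("row products   :", ROW_SIGNS)
    print("column products:", COL_SIGNS)
    print("consistent value assignments:", count_consistent_assignments())
```

The three row products are +1 while the column products multiply to -1, so multiplying all nine assigned values once by rows and once by columns gives contradictory signs; the exhaustive search over the 512 candidate assignments confirms that none survives.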

The original proof of Kochen and Specker is very ingenious, but quite difficult. In 
the meantime several authors have given much simpler proofs (e.g., Straumann, 2002). 

We find the result of Kochen and Specker entirely satisfactory in the sense that 
it clearly demonstrates that there is no way back to classical reality. Einstein's view 
that quantum mechanics is a kind of glorified statistical mechanics that ignores some 
hidden microscopic degrees of freedom, can thus not be maintained without giving up 
locality or (KS2). It would be interesting to know his reaction to these developments 
triggered by the EPR paper. 

Entanglement is not limited to questions of principle. It has already been em- 
ployed in quantum communication systems, and entanglement underlies all proposals 
of quantum computation.



4 Special Relativity as a symmetry principle 
4.1 Historical origin and conceptual meaning 

The principle of relativity goes back at least to Galileo. The idea that mechanical ex- 
periments cannot reveal an overall uniform and inertial (rectilinear) motion became 
known as the 'Galilean Principle of Relativity'. In Newtonian mechanics it is ex- 
pressed mathematically by the invariance of its equations of motion under the Galilean 
group. This mathematical statement has two interpretations, whose physical conno- 
tations differ in a subtle way. The first interpretation, called the 'passive' one, is that 
of a mere change of reference frames while keeping the system under study fixed. In 
the second interpretation, called the 'active' one, one keeps the reference system fixed 
while changing the state of motion of the system under study. If the physical world just 
consisted of these two objects, the reference system and the system under study, these 
two interpretations would be equivalent, since both amount to stating a relative change 
in the state of motion and there is nothing more to state. However, this is not the sit- 
uation usually encountered in physics. Typically, one has a system S to be studied, 
a reference frame F (which can be thought of as a physical system in its own right), 
and the rest R of the physical universe, parts of which may at times interact with S 
but which can otherwise be neglected. In the passive interpretation we only change the 
frame F, that is, we change the relative state of motion between F and the totality of 
other systems, here denoted by S + R. In the active interpretation we only act on S, in 
which case the cut is between S and F + R. 

So even if dynamically silent, the presence of R is important for the interpretation 
of symmetries. This is because a symmetry connects physically distinguishable states, 
thereby mapping solutions of the equations of motion to other, distinguishably differ- 
ent solutions. In the language of Hamiltonian mechanics this means that the Hamil- 
tonian function that generates the motion is invariant under the symmetry operation, 
but other observables need not be. This is precisely the difference between a physi- 
cal symmetry and a gauge transformation. Unfortunately this difference is sometimes 
blurred by speaking of "gauge symmetries". 

After the establishment of the principle of relativity in mechanics, the natural ques- 
tion to ask was whether non-mechanical phenomena could reveal preferred states of 
inertial motion. Such a preference was strongly suggested by various 'ether' theories 
of light and other electromagnetic phenomena during the 19th century. In fact, Newton 
already expressed his firm belief in some sort of force-mediating 'ether'. In a famous
letter to Robert Bentley, Newton wrote in 1692: 

"That gravity should be innate inherent & essential to matter so yt one 
body may act upon another at a distance through a vacuum wthout the 
mediation of any thing else by & through wch their action of force may 
be conveyed from one to another is to me so great an absurdity that I 
believe no man who has in philosophical matters any competent faculty 
of thinking can ever fall into it." (Newton, 1961, p. 254) 

All optical and electromagnetic experiments, however, failed to show any trace
of an ether rest-frame. This was hard to reconcile with Maxwell's equations, which
predicted an invariable speed c for electromagnetic waves in matter-free space, and
which were therefore thought to hold only in the ether's rest frame. The solution
to this problem was first given by Lorentz (1904) and Poincaré (1906). They found
that instead of being Galilean invariant, 12 Maxwell's equations are invariant under the
Lorentz group. Einstein independently derived this result in his 1905 paper on Special
Relativity (henceforth abbreviated SR), but, unlike Lorentz and Poincaré, gave a direct
physical meaning to the Lorentz transformations in terms of measurements of lengths 
and times. 13 One may say that Einstein established them on a kinematical rather than 
dynamical basis, though one should add here that this distinction is only defined rel- 
ative to the assumption that the "rods" and "clocks" entering the kinematical consid- 
erations eventually obey dynamical laws compatible with Lorentz invariance. If this 
is granted, the FitzGerald-Lorentz contraction, for example, can be understood kine- 
matically (i.e., as a result of a fundamental symmetry that is postulated to be realized 
by all fundamental matter-equations) rather than dynamically (i.e., as consequence of 
a complicated dynamical interaction between the measuring rod and the ether). Note 
that these two viewpoints are not mutually exclusive. 14 But the shift in emphasis es- 
tablishes a symmetry principle with potentially far superior heuristic power. 

In summary it seems fair to say that in 1905 SR seemed palpably close after all the 
preliminary work done by various people. But apparently it needed an unprejudiced 
newcomer to take the final step. 

4.2 The Lorentz group 

The new understanding of the Lorentz transformations as fundamental symmetries 
induced a very powerful selection principle for dynamical laws: All fundamental dy- 
namical laws of Nature should be Lorentz invariant. 15 By this we mean: (1) there 
is an action of the Lorentz group on state space; (2) this action maps solution curves 
to solution curves. (An alternative but equivalent definition uses observables rather 
than states.) After Minkowski's seminal work, as a result of which SR was gradually 
put into its modern mathematical form, this task could be approached in a systematic 
fashion. 

Minkowski realized that the Lorentz group could be understood as the auto- 
morphism group of a geometric structure on spacetime, which is as follows: The 
model for spacetime is a four-dimensional real affine space whose underlying vector 

12 It does not seem to be widely appreciated that a precise statement of Galilean non-invariance needs 
to invoke restrictive assumptions concerning the type of action, such as locality. It is instructive and 
amusing to note that there exists a non-local implementation of the Galilean group which makes it a 
symmetry group of Maxwell's equations; see, e.g., Sec. 5.9 in (Fushchich et al., 1993).

13 See (Damour, 2005) for a lucid recent account of Poincaré's contribution to SR.

14 In this respect Pauli wrote in his 1921 review article on Relativity: "The contraction of a measuring rod 
is not an elementary but a very complicated process. It would not take place except for the covariance 
with respect to the Lorentz group of the basic equations of electron theory, as well as those laws, as 
yet unknown to us, which determine the cohesion of the electron itself" (Pauli, 1958, p. 15).

15 The reader should be aware that there is some confusion in the literature as to the different meanings 
of terms like 'invariant', 'covariant', etc. 

16 The affine structure of spacetime is usually motivated by the law of inertia, by means of which one 
identifies inertial trajectories with (a subset of) the 'straight lines' of affine geometry.

space, $\mathbb{R}^4$, is endowed with a non-degenerate, symmetric bilinear form $\eta$ of signature
$(-,+,+,+)$. 17 In appropriate coordinates one has $\eta_{\mu\nu} = \mathrm{diag}(-1, 1, 1, 1)$. $\eta$ is called
the Minkowski metric and the affine space endowed with it is called Minkowski space.
The homogeneous Lorentz group is then characterized as the set of invertible linear
transformations that leave $\eta$ invariant.

The Galilean group, too, can be characterized as the automorphism group of some 
geometric structure on spacetime, which is again modelled on real four-dimensional 
affine space. The 'geometry' now includes an absolute simultaneity structure and a 
fixed euclidean metric on the simultaneity hypersurfaces. We stress that, at least as 
far as mechanics is concerned, the usual terminology 'non-relativistic' versus 'rel- 
ativistic' is quite inappropriate. Newtonian mechanics is perfectly relativistic: the 
principle of relativity being implemented by the Galilean group. What distinguishes 
Lorentz-invariant from Galilean-invariant mechanics is not the validity of the relativity 
principle, but the structurally different implementations of it. 

The major structural difference between the (homogeneous, proper, orthochronous)
Galilean group and the Lorentz group is that the latter is simple, 18
whereas the former is not even semi-simple due to the invariant abelian subgroup
formed by the pure boost transformations. In contrast, for the Lorentz group, the
set of pure boosts does not even form a subgroup. This is more than just a mathematical
curiosity. It implies that the relation of 'being relatively unrotated' is not transitive 
among inertial reference frames. If K' is boosted relative to K and K" is boosted 
relative to K', then K" is boosted as well as rotated relative to K, unless the boost 
velocities of K' and K" are collinear. A well known early application of this feature, 
which is not present in the Galilean group, was the downward correction by 50% of 
the spin-orbit coupling and consequently of the fine-structure intervals in atomic spec- 
tra, which was first pointed out by Thomas (1927). 19 The same effect is even more 
pronounced in nuclear physics, where the strong acceleration due to the nuclear force 
leads via the 'Thomas correction' to a much larger spin-orbit coupling than that due to 
the electromagnetic interaction, thereby giving rise to the so-called 'inverted doublets'. 
The non-transitivity of the relation 'being-relatively-unrotated' has more recently also 
entered the discussion of large-scale astronomical reference frames. 20 
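
The statement that two non-collinear boosts compose to a boost combined with a rotation can be verified directly. The following sketch works in 2+1 dimensions with coordinates (t, x, y) and units in which c = 1; it factors the product of two boosts as (rotation) times (pure boost) and returns the rotation angle. The function names and the chosen velocities are illustrative assumptions.

```python
import math

def boost(beta_x, beta_y):
    """Pure Lorentz boost acting on (t, x, y), in units with c = 1."""
    beta_sq = beta_x**2 + beta_y**2
    if beta_sq == 0.0:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    gamma = 1.0 / math.sqrt(1.0 - beta_sq)
    k = (gamma - 1.0) / beta_sq
    return [
        [gamma, -gamma * beta_x, -gamma * beta_y],
        [-gamma * beta_x, 1.0 + k * beta_x**2, k * beta_x * beta_y],
        [-gamma * beta_y, k * beta_x * beta_y, 1.0 + k * beta_y**2],
    ]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_angle_of_composition(first, second):
    """Apply boost `first`, then boost `second`, and return the angle of the
    spatial rotation R in the factorization  second * first = R * (pure boost)."""
    combined = matmul(boost(*second), boost(*first))
    # The time row of R * B equals the time row of B, which fixes the boost velocity.
    beta_x = -combined[0][1] / combined[0][0]
    beta_y = -combined[0][2] / combined[0][0]
    rotation = matmul(combined, boost(-beta_x, -beta_y))  # R = L * B^(-1)
    return math.atan2(rotation[2][1], rotation[1][1])

if __name__ == "__main__":
    collinear = rotation_angle_of_composition((0.6, 0.0), (0.3, 0.0))
    perpendicular = rotation_angle_of_composition((0.6, 0.0), (0.0, 0.6))
    print(f"collinear boosts    : {math.degrees(collinear):.6f} degrees")
    print(f"perpendicular boosts: {math.degrees(perpendicular):.6f} degrees")
```

For collinear velocities the angle vanishes, while for perpendicular velocities it does not; repeating the calculation with the Galilean boost matrices would give zero in both cases, since Galilean boosts commute and form a subgroup.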

4.3 Far-reaching consequences 

Replacing the Galilean group with the Lorentz group requires a modification of the 
dynamical laws of mechanics, since the latter is supposed to act through dynamical 
symmetries. The 'heuristic power' associated with this requirement now comes to the 
fore (cf. the end of Sec. 4.1). Consider, e.g., the simplest case of a free point-particle of

17 In the present context it is a matter of convention whether one chooses $(-,+,+,+)$ ['mostly plus'] or
$(+,-,-,-)$ ['mostly minus']. But in more exotic situations, like for non-orientable spacetimes, the
overall sign generally matters; see, e.g., (DeWitt-Morette & DeWitt, 1990).

18 A group is called simple if it has no non-trivial (i.e. other than the group itself and the group formed by 
the neutral element alone) invariant subgroups. It is called semi-simple if it has no non-trivial abelian 
invariant subgroups. 

19 A manifestly Lorentz invariant treatment using the Dirac equation automatically takes care of this 
effect. 

20 See, e.g., (Klioner & Soffel, 1998). For a review of algebraic and geometric aspects of the Lorentz 
group see, e.g., (Giulini, 2005b). 
mass $m_0$. Classically its dynamics is fully described by an action whose Lagrangian
is just its kinetic energy, $\tfrac{1}{2} m_0 v^2$. A straightforward Lorentz invariant modification,
which approaches the classical law in the limit $c \to \infty$, is given by the action

$$ S_{\rm particle} = -m_0 c^2 \int d\tau = -m_0 c^2 \int \sqrt{1 - v^2/c^2}\; dt , \tag{29} $$

where $d\tau = c^{-1}\sqrt{-\eta_{\mu\nu}\, dz^\mu dz^\nu}$ is the proper time along the worldline $z^\mu(t)$ of
the particle, obviously a Lorentz-invariant quantity. From the Lagrangian
$L = -m_0 c^2 \sqrt{1 - v^2/c^2}$, the expressions for energy and momentum immediately follow
by standard Lagrangian methods (again we set $\gamma(v) = 1/\sqrt{1 - v^2/c^2}$):

$$ E = \gamma(v)\, m_0 c^2 , \tag{30} $$
$$ \mathbf{p} = \gamma(v)\, m_0 \mathbf{v} . \tag{31} $$

Together they form the momentum four-vector $p^\mu = (E/c, \mathbf{p})$, which under a Lorentz
transformation, given by the matrix $L^\mu{}_\nu$, transforms like

$$ p^\mu \mapsto L^\mu{}_\nu\, p^\nu . \tag{32} $$

Clearly, the Minkowski-square of the four-momentum is an invariant (we write $p = |\mathbf{p}|$):

$$ \eta_{\mu\nu}\, p^\mu p^\nu = p^2 - E^2/c^2 = -m_0^2 c^2 , \tag{33} $$

showing that the following relation between energy and momentum is a Lorentz co- 
variant one: 

$$ E^2 = c^2 \left( p^2 + m_0^2 c^2 \right) . \tag{34} $$

This equation replaces the familiar $E = p^2/2m_0$ of Newtonian mechanics and plays a
central role throughout special-relativistic quantum (field) theory. One of its prominent 
features is that E enters quadratically. 
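
As a quick numerical cross-check of Eqs. (30), (31) and (34), the sketch below evaluates energy and momentum for a given speed, confirms that they satisfy the quadratic relation, and compares the kinetic energy with the Newtonian expression at small speed. The function names and the sample mass and speeds are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def energy(m0, v):
    """Eq. (30): E = gamma * m0 * c^2."""
    return gamma(v) * m0 * C**2

def momentum(m0, v):
    """Eq. (31): magnitude of p = gamma * m0 * v."""
    return gamma(v) * m0 * v

def energy_from_momentum(m0, p):
    """Eq. (34) solved for the positive root: E = c * sqrt(p^2 + m0^2 c^2)."""
    return C * math.sqrt(p**2 + (m0 * C) ** 2)

if __name__ == "__main__":
    m0 = 9.109e-31  # roughly the electron mass in kg (illustrative)
    fast, slow = 0.8 * C, 1.0e-3 * C
    print("E from (30):", energy(m0, fast))
    print("E from (34):", energy_from_momentum(m0, momentum(m0, fast)))
    kinetic = energy(m0, slow) - m0 * C**2
    newtonian = momentum(m0, slow) ** 2 / (2.0 * m0)
    print("kinetic / Newtonian at v = c/1000:", kinetic / newtonian)
```

Solving (34) for E requires choosing a sign for the square root; the code takes the positive root, which is the choice appropriate for a classical particle.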

These somewhat formal derivations (no interactions have been discussed yet) can
be complemented by an analysis of elastic two-particle scattering processes, which
shows that (31) is the unique generalization of the classical equation $\mathbf{p} = m_0\mathbf{v}$ compatible
with momentum conservation and Lorentz invariance. From (31) one may
deduce the expression for the kinetic energy, $E_{\rm kin} = m_0 c^2 (\gamma(v) - 1)$, which is just
(30), properly normalized so that $E_{\rm kin} = 0$ for $v = 0$.

The normalization of energy in (30) is not determined by general methods (which
always allow for additive constants). The last of Einstein's five papers of 1905, just
about three pages long, shows that the normalization adopted in (30) is more than just
a convenient choice. More precisely, using (1) the principle of relativity, (2) conservation
of energy, (3) the existence of a Newtonian limit, and (4) the transformation law
for the energy of an electromagnetic wave, as derived from the Lorentz transformation
properties of the electromagnetic field, Einstein shows that any emission of electromagnetic
radiation with energy $\Delta E$ by a body must decrease its inertial rest mass $m_0$

21 See, e.g., (Giulini, 2005a) for a brief presentation of this argument, which goes back to Lewis & 
Tolman (1909) 
by $\Delta E/c^2$. 22 He further argues that this holds independently of the form into which
the energy extracted from the body is turned. From this he jumps to the conclusion 
that all of the inertial mass of a body is a measure of its energy content; later this was 
expressed in the now most famous formula 

$$ E = mc^2 . \tag{35} $$

The implications of this far reaching insight can hardly be overrated. It provided 
the first means to estimate the enormous magnitude of nuclear binding-energies. Today
(35) is often taken as a symbolic expression for the ambivalent 'nuclear age'. But
it should be stressed that (35) only allows one to 'weigh' binding energies. It neither explains
them nor does it explain any of the nuclear processes, like fission or fusion,
which belong to the realm of nuclear physics proper. The weight of binding energies 
becomes even dominant on sub-nuclear scales. For example, according to Quantum 
Chromodynamics, the mass of a proton (made up of three light quarks, two 'up' and 
one 'down', interacting via gluon exchange) is almost entirely due to interaction ener- 
gies. The quark masses themselves contribute only about 2%. 
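
What 'weighing' a binding energy means in practice can be shown with the helium-4 nucleus. The rest energies used below are standard tabulated values, rounded to three decimals and assumed here for illustration; they do not appear in the text, and the function name is hypothetical.

```python
# Rest energies in MeV (rounded tabulated values, assumed for illustration)
PROTON_MEV = 938.272
NEUTRON_MEV = 939.565
HELIUM4_NUCLEUS_MEV = 3727.379

def binding_energy_mev(n_protons, n_neutrons, bound_state_rest_energy_mev):
    """Binding energy = rest energy of the free constituents minus that of the bound state."""
    free = n_protons * PROTON_MEV + n_neutrons * NEUTRON_MEV
    return free - bound_state_rest_energy_mev

if __name__ == "__main__":
    b = binding_energy_mev(2, 2, HELIUM4_NUCLEUS_MEV)
    free = 2 * PROTON_MEV + 2 * NEUTRON_MEV
    print(f"binding energy of helium-4: {b:.3f} MeV")
    print(f"fraction of constituent rest energy: {100.0 * b / free:.2f} %")
```

In a nucleus the binding energy is a correction of less than one percent to the total mass, in sharp contrast to the proton, where the interaction energy dominates.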

On a more fundamental level (35) changed our concept of matter radically, in that
it opens up the possibility for different forms of matter to change into each other. To be 
sure, the 'channels' along which these transmutations occur are constrained by various 
conservation laws. But there can be no doubt that this puts an irreversible end to the 
idea of naive atomism, since everlasting and unchanging elementary objects simply 
cannot exist. Rather, modern high-energy particle physics speaks and thinks in terms 
of creation and annihilation processes. 

4.4 The current experimental status of SR 

Modern particle physics would be unthinkable without SR. Leaving aside the concep- 
tual implications just mentioned, it has far-reaching kinematical consequences. For 
example, proton-antiproton collisions at Fermilab's Tevatron take place at energies of 
about 2 TeV, which is 2000 times the rest energy of the proton. In such machines there 
clearly is ample opportunity for possible deviations from SR to manifest themselves. 
Since these experiments, however, are not primarily designed to test SR, the quantities 
observed in them will depend in complicated ways on the fundamental assumptions of 
SR. This makes it hard to infer good quantitative upper-bounds for violations of SR 
from such experiments, even if, energetically speaking, they take place in the "ultra- 
relativistic regime". 

Experiments specifically designed to test the principle of relativity basically probe 
for dynamical effects of preferred reference frames. A good candidate for such a 
preferred frame is one in which the cosmic microwave background (CMB) appears 
most isotropic (i.e., without dipole anisotropy). It is called the CMB-frame. Ever since 
observation of the dipole anisotropy with the Cosmic Background Explorer (COBE) 
we know that the barycenter of our solar system moves relative to the CMB-frame at a 
speed of 370 km/s (Kogut et al. 1993). 

22 See Stachel & Torretti (1982) for a careful account of Einstein's argument, which also saves it from an 
unwarranted but often repeated criticism. 

Let us suppose that the CMB-frame, K, is such that in K the velocity of light c = 1 
(in appropriately chosen units) in all directions. Note that this implies that clocks in 
K are Einstein-synchronized. The transformation formulae between K (coordinatized 
by (x, t)) and an inertial frame K' (coordinatized by (x', t')) moving with relative 
velocity v = nv (n · n = 1) with respect to K are then of the general form (Mansouri 
& Sexl, 1976) 

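$$
\begin{aligned}
\mathbf{x}' &= d(v)\,\mathbf{x} + \mathbf{n}\,(\mathbf{n}\cdot\mathbf{x})\,\bigl(b(v) - d(v)\bigr) - b(v)\,\mathbf{v}\,t , \\
t' &= a(v)\,t + \mathbf{e}(v)\cdot\mathbf{x}' .
\end{aligned}
\tag{36}
$$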

Here a, b, d are functions of v whose interpretation is easily inferred: a is the factor 
of time dilation (for this reason we wrote t' as function of x' rather than x), and b and 
d are the factors of longitudinal and transverse length contraction respectively. These 
functions are to be determined experimentally. The values that SR assigns to them are 

$$ a_{SR}(v) = 1/b_{SR}(v) = \sqrt{1 - v^2} , \qquad d_{SR}(v) = 1 . \tag{37} $$

The vector e is determined by a, b, d once the convention for clock synchronization in 
K' is chosen. For example, for Einstein-synchronization one has 

$$ \mathbf{e}(v) = \mathbf{e}_E(v) = -\mathbf{v}\,\frac{a(v)}{b(v)\,(1 - v^2)} , \tag{38} $$

leading to the familiar expression $\mathbf{e}_{SR}(v) = -\mathbf{v}$ for SR. If we agree to 
Einstein-synchronize clocks in K', the following expression can be derived for the velocity 
of light in K' (Mansouri & Sexl 1976) 23 

$$ c'(\theta, v) = \frac{b(v)\,(1 - v^2)}{a(v)\,\sqrt{\cos^2\theta + b^2(v)\,d^{-2}(v)\,(1 - v^2)\,\sin^2\theta}} , \tag{39} $$

where $\theta$ is the angle between the light ray and v as measured in K'. This reduces to 
c' = 1 for the values given in (37), but depends on $\theta$ in the general case. The invariance 
under $\theta \to \theta + \pi$ reflects the Einstein-synchronization of the clocks in K'. 

We want to stress the following conceptually very important point: The expression 
for $c'(\theta, v)$ depends on the choice of clock synchronization in K'. This means that 
if one uses it to calculate light travel-times along open paths (i.e., paths that do not 
begin and end at the same point in space), the result will also depend on that choice. 
However, the calculated travel times will be independent of one's synchronization con- 
vention if the light paths are closed in space, since in that case only a single clock is 
involved. This is the case in the Michelson-Morley and Kennedy-Thorndike experi- 
ments discussed below. 

To second order in v we have for a, b, d: 

$$ a(v) \approx 1 + \alpha v^2 , \qquad b(v) \approx 1 + \beta v^2 , \qquad d(v) \approx 1 + \delta v^2 . \tag{40} $$

Experiments checking round-trip travel times of light involve 1/c', which to second 
order is given by: 

$$ \frac{1}{c'(\theta, v)} \approx 1 + \bigl(\beta - \delta - \tfrac{1}{2}\bigr)\, v^2 \sin^2\theta + (\alpha - \beta + 1)\, v^2 . \tag{41} $$

In SR one has $\alpha = -\beta = -1/2$ and $\delta = 0$, so the expressions in parentheses vanish. 

23 The relevant formula in this reference, (6.17), has a misprint: $d^2$ in the denominator should be $d^{-2}$, as 
shown in (39). 

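The relation between the exact light velocity (39) and its second-order form (41) can be 
checked numerically. The following is only a sketch: it assumes that the numerator of (39) is 
b(v)(1 - v^2), which is what consistency with (37) and (41) requires, and the non-SR 
coefficients alpha, beta, delta used in the second half are hypothetical values chosen for 
illustration, not measured ones. 

```python
import math

def c_prime(theta, v, a, b, d):
    """Velocity of light in K' for Einstein-synchronized clocks, eq. (39)."""
    root = math.sqrt(math.cos(theta) ** 2
                     + (b / d) ** 2 * (1 - v ** 2) * math.sin(theta) ** 2)
    return b * (1 - v ** 2) / (a * root)

def inv_c_prime_second_order(theta, v, alpha, beta, delta):
    """1/c' to second order in v, eq. (41)."""
    return (1
            + (beta - delta - 0.5) * v ** 2 * math.sin(theta) ** 2
            + (alpha - beta + 1) * v ** 2)

v = 1.23e-3  # speed relative to the CMB-frame in units of c

# SR values (37): c' must equal 1 for every direction theta.
a_sr = math.sqrt(1 - v ** 2)
b_sr = 1 / a_sr
d_sr = 1.0
sr_values = [c_prime(t, v, a_sr, b_sr, d_sr) for t in (0.0, 0.7, math.pi / 2)]

# Hypothetical non-SR coefficients (illustration only), inserted via (40).
alpha, beta, delta = -0.3, 0.4, 0.2
a = 1 + alpha * v ** 2
b = 1 + beta * v ** 2
d = 1 + delta * v ** 2
theta = 0.7
exact = 1 / c_prime(theta, v, a, b, d)
approx = inv_c_prime_second_order(theta, v, alpha, beta, delta)

print("SR values of c':", sr_values)
print("exact 1/c' - 1 :", exact - 1)
print("approx 1/c' - 1:", approx - 1)
```

The two non-SR numbers differ only at order v^4, whereas both deviate from 1 at order v^2, 
which is the effect the Michelson-Morley and Kennedy-Thorndike experiments look for. 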
Experiments checking the $\theta$ dependence of $c'(\theta, v)$ are commonly referred to as 
"Michelson-Morley" experiments; those checking the $v^2$ dependence as "Kennedy-Thorndike" 
experiments. The most stringent upper-bounds for the relative $\theta$-variation 
of $c'(\theta, v)$ provided by modern measurements are of the order of $10^{-15}$. For the 
v-variation, they are of the order $10^{-12}$. To translate these results into statements about 
the coefficients $(\beta - \delta - \tfrac{1}{2})$ and $(\alpha - \beta + 1)$ one has to assume some value for v, that 
is, one has to make an assumption about the value of our present velocity with respect 
to the potentially preferred frame. Since the latter is presently unanimously stipulated 
to be the CMB-frame, 24 with respect to which we move at a speed of 370 km/s or 
$1.23 \cdot 10^{-3}$ times the speed of light, one sets $v = 1.23 \cdot 10^{-3}$. With this value, the best 
current estimates (at the one-$\sigma$ level) of the upper-bound for the coefficients in (41) are 
(see Müller et al., 2003; Wolf et al., 2003): 

$$ \bigl|\beta - \delta - \tfrac{1}{2}\bigr| < 3.7 \cdot 10^{-9} \quad \text{(MM-experiment)} , \tag{42} $$

$$ |\alpha - \beta + 1| < 6.9 \cdot 10^{-7} \quad \text{(KT-experiment)} . \tag{43} $$
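The translation between the coefficient bounds and the measured relative variations of c' is 
plain arithmetic: by (41) each variation equals the corresponding coefficient times v^2. The 
short sketch below redoes this conversion; the only input not taken from the text is the 
standard value of the speed of light in km/s. 

```python
C_KM_S = 299_792.458          # speed of light in km/s (standard value, not from the text)

v = 370.0 / C_KM_S            # speed relative to the CMB-frame in units of c
mm_bound = 3.7e-9             # bound (42) on |beta - delta - 1/2|
kt_bound = 6.9e-7             # bound (43) on |alpha - beta + 1|

# By (41), the relative variation of c' is the coefficient times v^2.
mm_variation = mm_bound * v ** 2
kt_variation = kt_bound * v ** 2

print(f"v            = {v:.3e}")
print(f"MM variation = {mm_variation:.2e}")
print(f"KT variation = {kt_variation:.2e}")
```

The results reproduce the orders of magnitude quoted above for the relative variations, namely 
10^-15 for the angular dependence and 10^-12 for the velocity dependence. 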

To obtain upper-bounds for the three parameters $\alpha$, $\beta$, and $\delta$, an independent third 
experiment is needed. Experiments that allow independent determination of the factor 
$\alpha$ related to time dilation are called "Ives-Stilwell" experiments. In the latest version 
one does such experiments using so-called double Doppler-spectroscopy (with lasers) 
on $^7\mathrm{Li}^+$ ions, moving at a speed of 19 000 km/s. The best value today is (Saathoff, 
2003): 

$$ |2\alpha + 1| < 2.2 \cdot 10^{-7} \quad \text{(IS-experiment)} . \tag{44} $$

For more on the most recent experimental situation in SR, see Ehlers & Lämmerzahl 
(2006). 

The upper-bound on the value of $\alpha$ has additional conceptual significance. We 
mentioned that Einstein-synchronization in K' fixes e to be the function given by 
(38). Now, as was probably first realized by Eddington (1924, p. 11), in SR 
Einstein-synchronization is equivalent to synchronization by "slow clock-transport". In the 
more general setting discussed here, one can show that the value for e corresponding 
to slow clock-transport is given by 

$$ \mathbf{e}(v) = \mathbf{e}_T(v) = \mathbf{n}\,\frac{a'(v)}{b(v)} , \tag{45} $$

where $\mathbf{n} = \mathbf{v}/v$ and a' is the derivative of a. Hence the two synchronizations agree 
if and only if the expressions in (38) and (45) agree. This is the case when a(v) = 
$a_{SR}(v)$ (we obviously require that a(v) = 1 for v = 0). The upper-bound (44) may 
therefore also be read as the upper-bound for possible discrepancies between Einstein 
synchronization and synchronization by slow clock-transport. 
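The agreement of the two synchronizations for the SR time-dilation factor, and their 
disagreement otherwise, can be illustrated numerically. This is a minimal sketch under stated 
assumptions: only the components of e along n are compared, the derivative a' is approximated 
by a central difference, and the function `a_toy` with alpha = -0.4 is a hypothetical non-SR 
choice. To lowest order in v one has e_E ≈ -v and e_T ≈ 2 alpha v, so the mismatch is 
proportional to (2 alpha + 1) v, the combination bounded in (44). 

```python
import math

def e_einstein(v, a, b):
    """Component of e along n for Einstein synchronization, eq. (38)."""
    return -v * a(v) / (b(v) * (1 - v ** 2))

def e_transport(v, a, b, h=1e-6):
    """Component of e along n for slow clock-transport, eq. (45).

    The derivative a'(v) is approximated by a central difference.
    """
    da = (a(v + h) - a(v - h)) / (2 * h)
    return da / b(v)

def a_sr(v):
    """SR time-dilation factor, eq. (37)."""
    return math.sqrt(1 - v ** 2)

def b_sr(v):
    """SR longitudinal contraction factor, eq. (37)."""
    return 1 / math.sqrt(1 - v ** 2)

def a_toy(v, alpha=-0.4):
    """Hypothetical non-SR time-dilation factor with alpha != -1/2."""
    return 1 + alpha * v ** 2

v = 0.3
print("SR : e_E =", e_einstein(v, a_sr, b_sr), " e_T =", e_transport(v, a_sr, b_sr))
print("toy: e_E =", e_einstein(v, a_toy, b_sr), " e_T =", e_transport(v, a_toy, b_sr))
```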

24 Though it is considered to be unlikely, it is not impossible that the gravitational-wave background, 
once it is observed, 'moves' relative to the CMB-frame and therefore defines another potentially 
preferred frame. 

4.5 Relativistic quantum field theory 

From the very beginning, Lorentz invariance was a guiding principle in the development 
of quantum theory. Black-body radiation belongs above all to the quantum 
theory of the electromagnetic field, which had to be relativistically invariant. In this 
connection an important step by Jordan and Pauli (1928) should be mentioned. These 
authors introduced time-dependent field operators for the charge-free Dirac radiation 
field and determined the commutators of two field components evaluated at different 
spacetime points. For the field operators $F_{\mu\nu}(x)$ these commutators can be expressed 
in a manifestly invariant form with the help of the now famous invariant Jordan-Pauli 
distribution. The physical meaning of these results in terms of basic uncertainty relations 
in field measurements was later clarified by Bohr & Rosenfeld (1933). 

As is well-known, Schrödinger originally considered a Lorentz invariant equation, 
now known as the Klein-Gordon equation. Since this equation gave the wrong 
fine-structure splitting, Schrödinger restricted himself to the more modest goal of a 
non-relativistic wave mechanics. The decisive next step was taken by Dirac who succeeded 
in generalizing Pauli's description of spin-1/2 particles to a relativistic wave equation. 
Initially, Dirac's theory was considered a single-particle theory, but this interpretation 
was beset with great difficulties coming from negative energy states. These states 
could not consistently be eliminated and time-dependent external fields could cause 
transitions from positive to negative energy states. 

Reinterpretation of Dirac's single particle theory 

Dirac's solution to this problem was his so-called 'hole theory'. The ground state 
then becomes stable because all negative energy states are considered occupied so that 
transitions of positive energy electrons into negative energy states are forbidden by the 
Pauli Exclusion Principle. Furthermore, the vast 'sea' of negative energy particles is 
declared to be invisible. A 'hole' in this sea was interpreted by Dirac as a particle of 
positive energy and positive charge. At first, Dirac suggested that such particles be 
identified with the proton. It was soon pointed out, however, by Oppenheimer (1930) 
that this was unacceptable because it would imply that the hydrogen atom be very 
short-lived. Dirac accepted this criticism and proposed the existence of anti-electrons: 

"A hole, if there were one, would be a new kind of particle, unknown to 
experimental physics, having the same mass and opposite charge to the 
electron. We should not expect to find any of them in nature, on account 
of their rapid rate of recombination with electrons, but in high vacuum, 
they would be quite stable and amenable to observations." (Dirac, 1931, 
p. 61) 

Many of Dirac's colleagues were shocked by the audacity of his ideas. As an 
example we recall Pauli's skepticism, expressed in his famous article on wave mechanics 
(1933) before the discovery of the positron. First he points out that if there 
were anti-electrons there should also be anti-protons. He then writes: "The factual 
absence of such particles then is reduced to a special initial state, in which there is 
indeed only one kind of particles. This appears to be unsatisfactory already because of 
the fact that the laws of nature in this theory are symmetrical with respect to electrons 
and anti-electrons" (Pauli, 1933, p. 246). The matter-antimatter asymmetry is still a 
major problem of cosmology, about which we shall make some remarks in Sec. 4.9. 

After Anderson's discovery of the positron, it became clear that future work on 
quantum electrodynamics (QED) of spin-1/2 particles had to be based on hole theory, 
or something closely related to it. Through the work of Jordan and Wigner it became 
clear that the Dirac field had to be quantized by imposing anti-commutation relations. 
With the resulting elegant formalism it was possible to write the theory of electrons 
and positrons in a completely symmetric form under exchange of particles and anti- 
particles. In this formulation, which can be found in any modern quantum-field-theory 
textbook, the Dirac sea has no place "except as a poetic description for forming the 
electromagnetic current" (Wightman, 1972, p. 100). 

Heisenberg and Pauli (1929, 1930) were the first to attempt a general formulation 
of QED as a dynamical relativistic theory of quantized fields. With all these devel- 
opments a revolution had taken place that was driven by the problem of reconciling 
quantum mechanics and special relativity. All further developments are based on these 
foundational pillars. Below we shall make a few remarks about the tortuous and ongo- 
ing history of quantum field theory. What is amazing is that this theory, despite all its 
intrinsic difficulties, makes the most precise predictions in all of physics. What exactly 
lies behind this success is still unclear. 

Renormalization theory 

In the early 1930s a number of processes, such as radiative pair creation and annihi- 
lation, were successfully computed in the Born approximation. But in higher orders 
troublesome divergences remained. Weisskopf showed that compared to the single 
electron theory the most divergent terms for the self-energy cancelled, but a logarith- 
mic divergence remained. It was realized only after World War II that this remaining 
divergence would also disappear after a mass renormalization. A central problem early 
on was that of vacuum polarization. This was studied by a number of authors. Antic- 
ipating the idea of charge renormalization, they were able to extract correct finite pre- 
dictions for observable effects, e.g., in the energies of bound electrons. In even higher 
orders in the fine structure constant, a fascinating phenomenon turned up: Maxwell's 
equations are corrected by very small non-linear terms in the field strengths and their 
derivatives, leading for instance to photon-photon scattering. Heisenberg's subtraction 
procedure led to finite expressions, as was shown by Euler, Kockel and Heisenberg 
(Euler & Kockel, 1935; Heisenberg & Euler, 1936). Shortly afterwards, Weisskopf 
(1936) not only simplified their calculations but also gave a thorough discussion of the 
physics involved in charge renormalization. Weisskopf related the modification of the 
Lagrangian of Maxwell's theory to the change of the energy of the Dirac sea as a function 
of slowly varying external electromagnetic fields. (Avoiding the old-fashioned 
Dirac sea, one could now interpret this effective Lagrangian in terms of the interaction 
of a classical electromagnetic field with the vacuum fluctuations of the 
electron-positron field.) After a charge renormalization this change is finite and gives rise to 
electric and magnetic polarization vectors of the vacuum. These investigations showed 
that the quantum vacuum has very interesting properties, a subject we shall take up in 
Sec. 6.6 in connection with the current Dark Energy problem. 

Notwithstanding these successes, most of the leading physicists were not happy 
with the subtraction procedures and repeatedly expressed their misgivings. In the late 
1940s renormalization theory was developed in a systematic manner with the help of 
new, manifestly Lorentz invariant techniques. The infinities could then be sidestepped 
in an unambiguous manner. The new powerful methods of Feynman, Schwinger, 
Tomonaga, and Dyson made it possible to perform higher-order perturbation calculations 
for QED which turned out to be in spectacular agreement with experiment. With 
these developments QED became one of the most brilliant successes in the history of 
physics. 

Quantum field theory provides answers to some of the most profound questions 
about the nature of matter. It explains why there are two classes of particles — fermions 
and bosons — and how their properties are related to their intrinsic spin (spin-statistics 
theorem). The mysterious nature of indistinguishability in quantum mechanics is un- 
derstood, because identical particles are created by the same underlying field. 

QED became a model for non-Abelian gauge theories and the development of 
the highly successful Standard Model of particle physics. Since the early history of 
gauge theories is strongly tied to General Relativity (henceforth abbreviated GR), we 
postpone further discussion both of this subject, and of more recent developments, 
which include the gravitational interaction, to Sec. 6.4. 

4.6 Group-theoretic background of relativistic quantum field theory 

Mathematically speaking, the content of SR is largely the requirement of Lorentz in- 
variance. A characterization of the impact of SR on other branches of physics should 
therefore also include some statements about specific properties that can be traced to 
this requirement. This is particularly interesting in Quantum Field Theory, where as- 
pects of representation theory become important. The representation theory as such, 
however, can be discussed using classical rather than quantized fields. 

For simplicity we ignore space and time reflections and consider the group 
$\mathbb{R}^4 \rtimes SL(2,\mathbb{C})$, which is the double (and universal) cover of the connected component of 
the inhomogeneous Lorentz group. In what follows, we will simply refer to it as the 
Poincaré group. 

The classical fields $\Psi$ under consideration are maps from spacetime (Minkowski 
space) to some vector space V. The space V carries a finite dimensional irreducible 
representation $D^{(p,q)}$ of $SL(2,\mathbb{C})$. 25 With the appropriate choice of an inner 
product, the infinite-dimensional linear space of such fields carries a unitary representation 
of the Poincaré group. The free (i.e., linear) classical field equations of 
Klein-Gordon, Weyl (Neutrino equation), Dirac, Maxwell, Proca, Rarita-Schwinger, 
Bargmann-Wigner and Pauli-Fierz can then collectively be understood as projection 
conditions onto irreducible subspaces (possibly including space and time reflections) 
in this space. 

Investigations into the representation theory of the Poincaré group started with a 
seminal paper by Wigner (1939). This was one of the first serious mathematical papers 
on the representation theory of non-compact Lie groups. 26 Later Mackey generalized 
Wigner's method to what is now known as the theory of induced representations. This 
'Mackey Theory' reduces to the present case if one specializes to semi-direct products 
with one Abelian factor (here the translations). A nice account of this is given, e.g., by 
Niederer & O'Raifeartaigh (1974). 

25 Here p and q are zero or integer multiples of 1/2. 2p and 2q denote the numbers of 'undotted' and 
'dotted' spinor indices respectively, carried by the field. 

26 Remarkably, before sending it to Annals of Mathematics, Wigner submitted his paper to the less 
prestigious American Journal of Mathematics, where it was rejected with the remark that "this work is not 
interesting for mathematics" (see Wigner, 1993, Part A, Vol 1, p. 9). 

Wigner's construction of irreducible representations can briefly be described as 
follows: first one replaces the field $\Psi(x)$ on spacetime by its Fourier transform $\tilde\Psi(k)$ 
on momentum space. An element $(a, A) \in \mathbb{R}^4 \rtimes SL(2,\mathbb{C})$ acts on $\tilde\Psi(k)$ via 

$$ \tilde\Psi(k) \mapsto e^{i k \cdot a}\, D^{(p,q)}(A)\, \tilde\Psi(A^{-1} k) . \tag{46} $$

This immediately shows that irreducible subspaces must consist of fields whose support 
is confined to a single group orbit in momentum space. These orbits decompose 
into the following types: two families of infinitely many orbits each, indexed by 
mass m > 0 and given by the two-sheet hyperbolas $k_0 = \pm\sqrt{m^2 + \mathbf{k}^2}$, the future 
and past light cone, one infinite family (indexed by $\mu > 0$) of one-sheet hyperbolas 
$|\mathbf{k}| = \sqrt{\mu^2 + k_0^2}$, and finally the origin k = 0. 

The condition of having support within a group orbit, say of the first type, translates 
into a differential equation for $\Psi$, which in case of orbits of the first two types (m > 0) 
is just the Klein-Gordon equation for each component of $\Psi$: 

$$ (\Box - m^2)\,\Psi = 0 . \tag{47} $$

If we restrict ourselves to functions with support on one such orbit, say O, Wigner's 
trick consists of picking a reference point $k_*$ on O and an element $A_k \in SL(2,\mathbb{C})$ for 
each k on O such that $A_k k_* = k$. Using $A_k$, Wigner now redefines the basic field as 
follows: 

$$ \tilde\Psi_W(k) = D^{(p,q)}(A_k^{-1})\, \tilde\Psi(k) . \tag{48} $$

The field $\tilde\Psi_W$ obeys a transformation law of the form of (46), the only difference being 
that $D^{(p,q)}(A)$ gets replaced by 

$$ D^{(p,q)}\bigl(W(k, A)\bigr) , \qquad \text{where} \quad W(k, A) := A_k^{-1}\, A\, A_{A^{-1}k} . \tag{49} $$

What may look like a complication is, in fact, a crucial simplification, due to the 
obvious fact that $W(k, A)\,k_* = k_*$. One says that W(k, A) lies in the subgroup 
$\mathrm{Stab}(k_*) \subset SL(2,\mathbb{C})$ of elements that fix ('stabilize') $k_*$. One thus sees that an 
irreducible representation of the Poincaré group is obtained by imposing a simple 
projection condition on $\tilde\Psi_W$, saying that it assumes values in a subspace of V that is 
irreducible under the group $\mathrm{Stab}(k_*)$. If translated back to the fields $\Psi(x)$, such 
conditions become the wave equations which complement a condition such as (47) of having 
support on one orbit only. Regarding the groups $\mathrm{Stab}(k_*)$, one has 



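$$
\mathrm{Stab}(k_*) =
\begin{cases}
SU(2) & \text{for } k_* \text{ timelike (massive case)} \\
E(2) & \text{for } k_* \text{ lightlike and } \neq 0 \text{ (massless case)} \\
SL(2,\mathbb{R}) & \text{for } k_* \text{ spacelike (tachyonic case)} \\
SL(2,\mathbb{C}) & \text{for } k_* = 0 .
\end{cases}
\tag{50}
$$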
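For the massive case the statement that W(k, A) stabilizes the reference momentum can be 
checked with a few lines of linear algebra. The sketch below makes several assumptions that 
are not in the text: it works with 4x4 Lorentz matrices acting on four-momenta instead of 
SL(2,C) matrices, it takes A_k to be the standard pure boost from k_* = (m, 0, 0, 0) to k, it 
takes the group element A to be a pure boost as well, and it uses the signature (-,+,+,+) 
to classify momenta. All function names are hypothetical helpers. 

```python
import math

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][l] * B[l][j] for l in range(4)) for j in range(4)]
            for i in range(4)]

def apply(A, k):
    """Apply a 4x4 matrix to a four-vector."""
    return [sum(A[i][j] * k[j] for j in range(4)) for i in range(4)]

def on_shell(p, m):
    """Four-momentum (E, px, py, pz) of mass m with spatial part p."""
    return [math.sqrt(m * m + sum(x * x for x in p))] + list(p)

def flip(k):
    """Reverse the spatial part of a four-momentum."""
    return [k[0]] + [-x for x in k[1:]]

def boost_to(k, m):
    """Standard boost A_k taking k_* = (m, 0, 0, 0) to k; its inverse is boost_to(flip(k), m)."""
    E, p = k[0], k[1:]
    A = [[0.0] * 4 for _ in range(4)]
    A[0][0] = E / m
    for i in range(3):
        A[0][i + 1] = A[i + 1][0] = p[i] / m
        for j in range(3):
            A[i + 1][j + 1] = (1.0 if i == j else 0.0) + p[i] * p[j] / (m * (E + m))
    return A

def stabilizer(k, tol=1e-12):
    """Name of Stab(k) according to (50), assuming signature (-,+,+,+)."""
    k2 = -k[0] ** 2 + sum(x * x for x in k[1:])
    if all(abs(x) <= tol for x in k):
        return "SL(2,C)"
    if k2 < -tol:
        return "SU(2)"
    if k2 > tol:
        return "SL(2,R)"
    return "E(2)"

m = 1.0
k_star = [m, 0.0, 0.0, 0.0]
k = on_shell([0.3, -0.2, 0.5], m)          # a point on the mass-m orbit
q = on_shell([0.0, 0.8, 0.1], 1.0)         # parametrizes the group element A
Lam = boost_to(q, 1.0)                     # A
Lam_inv = boost_to(flip(q), 1.0)           # A^{-1}
k_back = apply(Lam_inv, k)                 # A^{-1} k

# W(k, A) = A_k^{-1} A A_{A^{-1} k}, eq. (49)
W = matmul(matmul(boost_to(flip(k), m), Lam), boost_to(k_back, m))
Wk = apply(W, k_star)                      # should reproduce k_star
R = [row[1:] for row in W[1:]]             # spatial block: a rotation

print("W k_* =", Wk)
print("Stab(k) for the massive orbit:", stabilizer(k))
```

In this representation W acts on the spatial components as a rotation (the Wigner rotation), 
in line with the entry SU(2) for timelike k_* in (50). 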



The massive cases are thus classified according to the value for mass (picking 
the orbit) and spin (classifying the unitary irreducible representation of the stabilizer 
subgroup, here SU(2)). The massless cases are classified according to the unitary 
irreducible representations of E(2), the double cover of two-dimensional Euclidean motions. 
Here there are many more representations than seem physically relevant. Those 
which represent the 'translations' in E(2) non-trivially are all infinite dimensional and 
are usually discarded (they correspond to infinitely many 'internal' degrees of freedom). 
The remaining representations of the one-parameter subgroup of rotations are 
classified by a single number, helicity, which is either zero or a positive-integer multiple 
of 1/2. The remaining cases have so far not found applications with a clear 
physical interpretation, though they appear in various guises in some versions of string 
theory. All non-trivial unitary irreducible representations of $SL(2,\mathbb{R})$ and $SL(2,\mathbb{C})$ 
are necessarily infinite dimensional and were classified by Bargmann (1947). One is 
thus left with the massive and massless cases. 

The irreducible representation-spaces are Hilbert spaces (of square integrable functions 
on an orbit in momentum space), which in the physically relevant cases are denoted 
by $\mathcal{H}_{m,s}$ (where s refers to spin for m > 0 and to helicity for m = 0). In 
relativistic quantum field theory they serve as definition of 'one-particle Hilbert spaces', 
which are used as elementary building blocks for the total Hilbert space. This is where 
the dictum, often attributed to Wigner, comes from that an 'elementary particle' is a 
unitary irreducible representation of the Poincaré group. 

In relativistic quantum field theories processes of pair creation and annihilation 
are dynamically unavoidable. Hence it would be inconsistent to limit oneself to 
one-particle spaces $\mathcal{H}_{m,s}$. Particles of type (m, s) should be represented by their entire 
Fock space 

$$ \mathcal{F}_{m,s} = \bigoplus_{n=0}^{\infty} \mathcal{H}_{m,s}^{\otimes n} , \tag{51} $$

where $\otimes n$ denotes either the symmetrized (for 2s even) or antisymmetrized (for 2s 
odd) n-fold tensor product. The total Hilbert space is then the tensor product over 
all Fock spaces for all particle species considered. This is the arena where scattering 
states in perturbative Quantum Field Theory live. 27 

27 As a consequence of a theorem due to Rudolf Haag, it is known that Fock space cannot be the 
representation space for the fundamental equal-time commutation relations in case of translation invariant 
theories of interacting fields (see, e.g., the later (reprint) edition of Streater & Wightman, 1963). Fock 
space, however, still plays a useful role for displaying scattering states and S-matrices. 

4.7 The rise of supersymmetry 

One issue that attracted much attention during the 1960s was whether the observed 
particle multiplets could be understood on the basis of an all-embracing symmetry 
principle that would combine the Poincaré group with the internal symmetry groups 
displayed by the multiplet structures. This combination should be non-trivial, i.e., 
not a direct product, for otherwise the internal symmetries would commute with the 
spacetime symmetries and lead to multiplets degenerate in mass and spin (see, e.g., 
O'Raifeartaigh, 1965). Subsequently, a number of no-go theorems appeared, which 
culminated in the now most famous theorem of Coleman & Mandula (1967). This 
theorem states that those generators of symmetries of the S-matrix belonging to the 
Poincaré group necessarily commute with those belonging to internal symmetries. The 
theorem is based on a series of assumptions 28 involving the crucial technical condition 
that the S-matrix depends analytically on standard scattering parameters. What is 
less visible here is that the structure of the Poincaré group enters in a decisive way. 
This result would not follow for the Galilean group, as was explicitly pointed out by 
Coleman & Mandula (1967). 

One way to avoid the theorem of Coleman & Mandula is to generalize the notion 
of symmetries. An early attempt was made by Golfand & Likhtman (1971), who 
constructed what is now known as a Super-Lie algebra, which generalizes the concept 
of Lie algebra (i.e. symmetry generators obeying certain commutation relations) to 
one also involving anti-commutators. In this way it became possible for the first time 
to link particles of integer and half-integer spin by a symmetry principle. It is true that 
supersymmetry still maintains the degeneracy in masses and hence cannot account for 
the mass differences in multiplets. But its most convincing property, the symmetry 
between bosons and fermions, suggested a most elegant resolution of the notorious 
ultraviolet divergences that beset Quantum Field Theory. 

It is remarkable that the idea of a cancellation of bosonic and fermionic contri- 
butions to the vacuum energy density occurred to Pauli. In his lectures on "Selected 
Topics in Field Quantization", delivered in 1950-51 and still in print, he posed the 
question "whether these zero-point energies [from Bosons and Fermions] can com- 
pensate each other" (Pauli, 2000, p. 33). He tried to answer this question by writing 
down the formal expression for the zero point energy density of a quantum field of 
spin j and mass $m_j > 0$ (Pauli restricted attention to spin 0 and spin 1/2, but the 
generalization is immediate): 

$$ 4\pi^2\, \frac{E_j}{V} = (-1)^{2j}\,(2j + 1) \int dk\, k^2 \sqrt{k^2 + m_j^2} . \tag{52} $$

Cancellation should take place for high values of k. The expansion 

$$ 4 \int_0^K dk\, k^2 \sqrt{k^2 + m_j^2} = K^4 + m_j^2 K^2 - \tfrac{1}{2}\, m_j^4 \log(2K/m_j) + \tfrac{1}{8}\, m_j^4 + O(K^{-2}) \tag{53} $$

shows that the quartic, quadratic, and logarithmic terms must cancel in the sum over j 
for the limit $K \to \infty$ to exist. This implies that for n = 0, 2, 4 one must have 

$$ \sum_j (-1)^{2j} (2j + 1)\, m_j^n = 0 \qquad \text{and} \qquad \sum_j (-1)^{2j} (2j + 1)\, m_j^4 \log(m_j) = 0 . \tag{54} $$

Commenting on this result, Pauli observed that "these requirements are so extensive 
that it is rather improbable that they are satisfied in reality" (Pauli, 2000, p. 33). 
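Pauli's conditions are easy to evaluate for a given list of fields. The following sketch is 
an illustration only: the two spectra are hypothetical toy examples (two spin-0 fields and one 
spin-1/2 field, first with equal masses and then with a shifted fermion mass), and the 
large-K form used in `expansion` is the one obtained by evaluating the integral in closed 
form, with the logarithmic term weighted by m^4/2 and a constant term m^4/8. 

```python
import math

def pauli_sums(spectrum):
    """Sums entering Pauli's conditions for n = 0, 2, 4, plus the log-weighted sum.

    spectrum: list of (twice_spin, mass) pairs with mass > 0, one entry per field.
    """
    def weight(twice_spin):
        # (-1)^{2j} (2j + 1)
        return (-1) ** twice_spin * (twice_spin + 1)

    power_sums = [sum(weight(ts) * m ** n for ts, m in spectrum) for n in (0, 2, 4)]
    log_sum = sum(weight(ts) * m ** 4 * math.log(m) for ts, m in spectrum)
    return power_sums + [log_sum]

def four_times_integral(K, m):
    """Closed form of 4 * integral_0^K dk k^2 sqrt(k^2 + m^2)."""
    return (0.5 * K * (2 * K ** 2 + m ** 2) * math.sqrt(K ** 2 + m ** 2)
            - 0.5 * m ** 4 * math.asinh(K / m))

def expansion(K, m):
    """Large-K expansion of the same quantity, dropping terms that vanish as K grows."""
    return K ** 4 + m ** 2 * K ** 2 - 0.5 * m ** 4 * math.log(2 * K / m) + m ** 4 / 8

# Hypothetical toy spectra: (twice_spin, mass)
degenerate = [(0, 1.5), (0, 1.5), (1, 1.5)]   # bosonic and fermionic weights match
split = [(0, 1.5), (0, 1.5), (1, 2.0)]        # fermion mass shifted

print("degenerate:", pauli_sums(degenerate))
print("split     :", pauli_sums(split))
print("remainder of expansion at K = 50:",
      four_times_integral(50.0, 1.0) - expansion(50.0, 1.0))
```

For the degenerate spectrum all four sums vanish, while shifting the fermion mass leaves only 
the n = 0 condition intact, which illustrates how restrictive the requirements are. 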

The idea of supersymmetry is that this is precisely what happens as a consequence 
of the one-to-one correspondence between bosons and fermions. But the real world 
does not seem to be as simple as that. Supersymmetry, if it exists at all, must strongly 
be broken in the phase we live in. So far no supersymmetric partner of any existing 
particle has been detected, even though some of them (e.g., the neutralino) are currently 
suggested to be viable candidates for the missing-mass problem in cosmology. 
Future findings (or non-findings) at the Large Hadron Collider (LHC) will probably 
have a decisive impact on the future of the idea of supersymmetry, which (whether or 
not it is realized in Nature) is certainly very attractive. 

28 The assumptions are: (1) there exists a non-trivial (i.e., $\neq 1$) S-matrix which depends analytically 
on s (the squared center-of-mass energy) and t (the squared momentum transfer); (2) the mass spectrum 
of one-particle states consists of (possibly infinite) isolated points with only finite degeneracies; 
(3) the generators (of the Lie algebra) of symmetries of the S-matrix contain (as a Lie-subalgebra) the 
Poincaré generators; (4) some technical assumptions concerning the possibility of writing the symmetry 
generators as integral operators in momentum space. 



4.8 More on spin-statistics 

Pauli's proof of the spin-statistics correlation is such an impressive example for the 
force of abstract symmetry principles, that we wish to recall the basic lemmas on 
which it rests. We begin by replacing the proper orthochronous Lorentz group by its 
double (= universal) cover SL(2,C) to include half-integer spin fields. We stress that 
everything that follows merely requires the invariance under this group. No require- 
ments concerning invariance under space- or time reversal are needed. 

Any finite-dimensional complex representation of SL(2,C) is labelled by an ordered 
pair (p, q), where p and q may assume independently all non-negative integer or 
half-integer values. 29 The tensor product of two such representations decomposes as 
follows 

$$ D^{(p,q)} \otimes D^{(p',q')} = \bigoplus_{r=|p-p'|}^{p+p'} \; \bigoplus_{s=|q-q'|}^{q+q'} D^{(r,s)} , \tag{55} $$

where (and this is the important point in what follows) the sums proceed in integer 
steps in r and s. With each $D^{(p,q)}$ let us associate a 'Pauli Index', given by 

$$ \pi : D^{(p,q)} \to \bigl((-1)^{2p}, (-1)^{2q}\bigr) \in \mathbb{Z}_2 \times \mathbb{Z}_2 . \tag{56} $$

This association may be extended to sums of such $D^{(p,q)}$ proceeding in integer steps, 
simply by assigning to the sum the Pauli Index of its terms (which are all the same). 
Then we have 30 

$$ \pi\bigl(D^{(p,q)} \otimes D^{(p',q')}\bigr) = \pi\bigl(D^{(p,q)}\bigr) \cdot \pi\bigl(D^{(p',q')}\bigr) . \tag{57} $$
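The multiplicativity of the Pauli Index follows from the integer steps in the decomposition 
of the tensor product and can be verified by brute force for small labels. The sketch below 
is an illustration with hypothetical helper names; it represents the labels p, q as exact 
fractions and also checks that the dimensions (2r + 1)(2s + 1) of the constituents add up to 
the dimension of the product. 

```python
from fractions import Fraction

def pauli_index(p, q):
    """Pauli Index ((-1)^{2p}, (-1)^{2q}) of D^(p,q)."""
    return ((-1) ** int(2 * p), (-1) ** int(2 * q))

def steps(lo, hi):
    """Values lo, lo + 1, ..., hi (integer steps)."""
    out = []
    x = lo
    while x <= hi:
        out.append(x)
        x += 1
    return out

def decompose(p, q, p2, q2):
    """Labels (r, s) of the irreducible constituents of D^(p,q) x D^(p2,q2)."""
    return [(r, s)
            for r in steps(abs(p - p2), p + p2)
            for s in steps(abs(q - q2), q + q2)]

def index_product(i1, i2):
    """Componentwise product in Z_2 x Z_2."""
    return (i1[0] * i2[0], i1[1] * i2[1])

def dim(p, q):
    """Dimension (2p + 1)(2q + 1) of D^(p,q)."""
    return int((2 * p + 1) * (2 * q + 1))

labels = [Fraction(n, 2) for n in range(4)]   # 0, 1/2, 1, 3/2
all_ok = True
for p in labels:
    for q in labels:
        for p2 in labels:
            for q2 in labels:
                parts = decompose(p, q, p2, q2)
                expected = index_product(pauli_index(p, q), pauli_index(p2, q2))
                same_index = all(pauli_index(r, s) == expected for r, s in parts)
                same_dim = sum(dim(r, s) for r, s in parts) == dim(p, q) * dim(p2, q2)
                all_ok = all_ok and same_index and same_dim

half = Fraction(1, 2)
print("D^(1/2,0) x D^(0,1/2) ->", decompose(half, 0, 0, half))
print("all checks passed:", all_ok)
```

The first printed line shows the familiar fact that an undotted and a dotted spinor combine 
into the vector representation D^(1/2,1/2), whose Pauli Index is (-,-). 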

According to their representations, we can associate a Pauli Index with spinors and 
tensors. For example, a tensor of odd/even degree has Pauli Index $(-,-)/(+,+)$. The 
partial derivative, $\partial$, counts as a tensor of degree one. Now consider the most general 
linear (non-interacting) field equations for integer spin (here and in what follows 
$\sum(\cdots)$ simply stands for "sum of terms of the general form $(\cdots)$"): 

$$
\begin{aligned}
\sum \partial\, \Psi^{(+,+)} &= \sum \Psi^{(-,-)} , \\
\sum \partial\, \Psi^{(-,-)} &= \sum \Psi^{(+,+)} .
\end{aligned}
\tag{58}
$$

These are invariant under 

$$
\Theta : \quad
\begin{cases}
\Psi^{(+,+)}(x) \mapsto \Psi^{(+,+)}(-x) , \\
\Psi^{(-,-)}(x) \mapsto -\Psi^{(-,-)}(-x) .
\end{cases}
\tag{59}
$$



29 As before, 2p and 2q are the numbers of 'undotted' and 'dotted' spinor indices, respectively. 

30 This may be expressed by saying that the map $\pi$ is a homomorphism of semigroups. One semigroup 
consists of direct sums of irreducible representations proceeding in integer steps with operation $\otimes$, the 
other is $\mathbb{Z}_2 \times \mathbb{Z}_2$, which is actually a group. 

Next consider any current that is a polynomial in the fields and their derivatives: 



(60) 



Then one has 

$$ (\Theta J)(x) = -J(-x) . \tag{61} $$



This shows that for any solution of the field equations with charge Q for the con- 
served current J (Q being the space integral over J°) there is another solution (the 
transformed) with charge $-Q$. It follows that charges of conserved currents cannot 
be sign-definite in any SL(2,C)-invariant theory of non-interacting integer spin fields. 
In the same fashion one shows that conserved quantities, stemming from divergence- 
less symmetric tensors of rank two, bilinear in fields, cannot be sign-definite in any 
SL(2, C) invariant theory of non-interacting half-integer spin fields. In particular, the 
conserved quantity in question could be energy! 

An immediate but far-reaching first conclusion is that there cannot exist a relativistic 
generalization of Schrödinger's one-particle wave equation. For example, for 
integer-spin particles, one simply cannot construct a non-negative spatial probability 
distribution derived from conserved four-currents. Hence these results for c-number 
fields strongly indicate the need for second quantization. 

Upon second quantization the celebrated spin-statistics connection, first proven 
by Fierz (1939), can be derived in a few lines. It says that integer spin fields cannot 
be quantized using anticommutators and half-integer spin fields cannot be quantized 
using commutators. Here the already mentioned Jordan-Pauli distribution plays a crucial 
role 31 in the (anti)commutation relations, which ensures causality (observables 
localized in spacelike separated regions commute). Also, the crucial hypothesis of the 
existence of an SL(2, C) invariant stable vacuum state is adopted. Pauli ends his paper 
by saying: "In conclusion we wish to state, that according to our opinion the connection 
between spin and statistics is one of the most important applications of the special 
relativity theory." (Pauli, 1940, p. 722). It took almost 20 years before first attempts 
were made to generalize this result to the physically relevant case of interacting fields 
by Lüders & Zumino (1958). 

4.9 Existence of antimatter (CPT-theorem) 

A remarkable general consequence of local relativistic quantum field theory is the ex- 
istence of antimatter, even if the theory is not invariant under charge conjugation (C). 
The CPT-theorem states that the invariance with respect to the proper Lorentz group 
implies the anti-unitary CPT symmetry. (P stands for space reflection and T for time 
reflection.) In the framework of Lagrangian field theory several authors (Schwinger, 
Liiders, and others) contributed to this important result, but the final formulation was 
given by Pauli (1955), assuming, besides locality, the normal spin-statistics connec- 
tion. Soon afterwards, Jost (1957) gave a general proof of the CPT-theorem using 

31 The Jordan-Pauli distribution is uniquely characterized (up to a constant factor) by: (1) it is Lorentz 
invariant; (2) it vanishes for spacelike separated arguments; (3) it satisfies the Klein-Gordon equation. 
The (anti)commutators of the free fields must be proportional to the Jordan-Pauli distribution, or to 
finitely many derivatives of it, either of exclusively even or of exclusively odd order.
Wightman's framework of quantum field theory ('axiomatic quantum field theory'). In
fact, he proved a more precise result. Jost's refined form of the theorem states that the 
CPT symmetry holds if and only if the following weak locality condition is satisfied: 
Consider, for simplicity, a theory with a single neutral scalar field $\varphi(x)$. In that case,
the vacuum expectation values of products of field operators (Wightman distributions)
satisfy

\[ (\Omega, \varphi(x_1)\varphi(x_2)\cdots\varphi(x_n)\Omega) = (\Omega, \varphi(x_n)\varphi(x_{n-1})\cdots\varphi(x_1)\Omega) , \tag{62} \]

if the $\{x_j\}$ are pairwise spacelike: $(x_i - x_j)^2 > 0$ for all $i \neq j$. The elegant proof of
Jost (1957), which was the starting point for many applications, makes crucial use of 
the elementary fact that the simultaneous reflection in space and time is contained in 
the identity-component of the complex Lorentz group. (See also the classic books of 
Jost (1965), and of Streater & Wightman (1963).) The CPT-theorem has become very 
important, because the electro-weak interactions are not invariant under the separate 
operations C, P and T. It has many applications, and so far no sign of an experimental 
violation of the CPT-symmetry has been found. Because of this deeply rooted symme- 
try, the observed matter-anti-matter asymmetry in the universe is a profound problem. 
In spite of interesting attempts, no satisfactory quantitative explanation has been put 
forward. For a recent review, see Dine & Kusenko (2004). 

5 On the journey to General Relativity 

It is often said that whereas SR was "in the air" around 1905, GR would hardly be 
conceivable without the penetrating thinking of Albert Einstein. His path to GR mean- 
dered, encountered confusing forks, and even included a major U-turn. Einstein's own 
words to describe the ambivalent feelings of the searching mind are unforgettable:

"Im Lichte bereits erlangter Erkenntnisse erscheint das glücklich Erreichte
fast wie selbstverständlich, und jeder intelligente Student erfaßt es
ohne zu große Mühe. Aber das ahnungsvolle, Jahre währende Suchen im
Dunkeln mit seiner gespannten Sehnsucht, seiner Abwechslung von Zuversicht
und Ermattung und seinem endlichen Durchbrechen zur Klarheit,
das kennt nur, wer es selbst erlebt hat." (Einstein, 1977, p. 138) 32

This is not the place to give an account of the complex history that led from SR 
to GR (see Renn, forthcoming). But what we can do here is to present some selected 
issues from a physicist's perspective. We start with some early attempts to formulate 
a relativistic theory of gravity and then turn to the question of how GR could have been
discovered within the framework of Poincaré invariant field theories.

5.1 Early attempts 

Soon after the formulation of SR Einstein began thinking about how to fit Newto- 
nian gravity within that framework. Already in his "Jahrbuch paper" (CPAE, Vol. 2, 

32 "In the light of knowledge attained, the happy achievement seems almost a matter of course, and any 
intelligent student can grasp it without too much trouble. But the years of anxious searching in the dark, 
with their intense longing, their alternations of confidence and exhaustion and the final emergence into 
the light — only those who have experienced it can understand it."
Doc. 47) he went beyond the framework of SR. He did not seriously consider the possibility
of a special-relativistic theory of gravity until presented with such a theory by
Gunnar Nordström (Norton 1992, 1993). Except for his attempted rebuttals of Nordström's
theories, no notes appear to be extant to document his own early attempts in this
direction. But later recollections by Einstein make it quite easy to more or less guess
the essential steps. The following contains our (modern) interpretation of how one
might proceed along the lines of Einstein's 1933 recollections (reprinted in Einstein,
1977; English translation in Einstein, 1954). There he says:

"The simplest thing was, of course, to retain the Laplacian scalar potential 
of gravity, and to complete the Poisson equation in an obvious way by a 
term differentiated with respect to time in such a way, that compatibility 
with special relativity was achieved." (Einstein, 1977 p. 135) 

Einstein obviously refers to replacing the Laplacian $\Delta$ by the d'Alembertian $\Box$
in the Poisson equation

\[ \Delta\phi = 4\pi G\rho , \tag{64} \]

where $\phi$ is the gravitational potential, $\rho$ is the mass density, and $G$ is Newton's
gravitational constant.

This turns the left-hand side of the Poisson equation into a Lorentz scalar. But then
the source term on the right-hand side of (64) should also be a scalar, which is neither
true for the density of mass nor for the density of rest-mass (rest-mass is a scalar, but
its density is not). Later Laue pointed out to Einstein that the trace $T = T^\mu{}_\mu$ of the
energy-momentum tensor was a natural candidate, as Einstein e.g. acknowledges at
the end of the "Entwurf" paper (cf. CPAE, Vol. 4, Doc. 13, p. 322). This leads to 33

\[ \Box\phi = -\kappa T , \quad \text{with } \kappa := 4\pi G/c^2 . \tag{65} \]
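As a consistency check, equation (65) should reduce to the Poisson equation (64) in the Newtonian regime. The following short calculation is a sketch under an assumption not spelled out in the text, namely that the source is static, pressureless dust with rest-mass density $\rho_0$:

```latex
% Assumption: static dust, T^{\mu\nu} = \rho_0 u^\mu u^\nu with u^\mu = (c,0,0,0),
% and signature (-,+,+,+) as in footnote 33.
\begin{align*}
T &= \eta_{\mu\nu} T^{\mu\nu} = \rho_0\, u^\mu u_\mu = -\rho_0 c^2 , \\
\Box\phi &= \Delta\phi \qquad \text{(static field)} , \\
\Delta\phi &= -\kappa T = \frac{4\pi G}{c^2}\,\rho_0 c^2 = 4\pi G \rho_0 .
\end{align*}
```

The minus sign in (65) is thus exactly what the 'mostly plus' convention requires for an attractive force.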

The next step would be to find the equations of motion for the world line $z(\tau)$ of a test
particle ($\tau$ is the proper time and dots refer to differentiation with respect to $\tau$). The
obvious first guess,

\[ \ddot z^\mu = -\partial^\mu\phi , \tag{66} \]

is clearly impossible, since it implies the overly restrictive integrability condition
$\dot z^\mu\partial_\mu\phi = 0$. However, this problem can easily be taken care of by replacing the
right-hand side of (66) with its projection orthogonal to $\dot z$:

\[ \ddot z^\mu = -\left(\eta^{\mu\nu} + \dot z^\mu\dot z^\nu/c^2\right)\partial_\nu\phi . \tag{67} \]

This results in three consistent equations of motion for the three spatial velocity components
(the fourth component is, as always, determined by $\dot z^\mu\dot z_\mu = -c^2$).

It is instructive to relate the naive theory based on (65) and (67) to a more systematic
treatment based on modern methods using the action principle. The physical
system under consideration consists of the gravitational field $\Phi$ (its relation to the field

33 Recall that $T^0{}_0 = -T_{00}$ is minus the energy density, due to our convention $\eta_{\mu\nu} = \mathrm{diag}(-,+,+,+)$.
$\phi$ above will become clear soon) and matter. Hence we have three basic contributions
to the total action,

\[ S_{\rm tot} = S_{\rm field} + S_{\rm matter} + S_{\rm int} , \tag{68} \]

where $S_{\rm field}$ is the action of the free gravitational field and $S_{\rm int}$ that of the interaction
between the gravitational field and matter. If we assume our $\Phi$ to satisfy equation (65),
their sum is given by 34

\[ S_{\rm field} + S_{\rm int} = -\frac{1}{\kappa c^3}\int d^4x \left(\tfrac12\,\partial_\mu\Phi\,\partial^\mu\Phi - \kappa\Phi T\right) . \tag{69} \]

$S_{\rm matter}$ is the action for the matter system, which we only specify in that we assume
that the matter consists of a point particle of rest-mass $m_0$ and a 'rest' that remains
unspecified. Hence, $S_{\rm matter} = S_{\rm particle} + S_{\rm r.o.m.}$ (r.o.m. = 'rest of matter'), where

\[ S_{\rm particle} = -m_0c^2\int d\tau . \tag{70} \]

The quantity $d\tau = \frac{1}{c}\sqrt{-\eta_{\mu\nu}\,dz^\mu dz^\nu}$ is the proper time along the worldline of the
particle. The energy-momentum tensor of the particle is given by

\[ T^{\mu\nu}(x) = m_0c\int d\tau\,\dot z^\mu(\tau)\,\dot z^\nu(\tau)\,\delta^{(4)}(x - z(\tau)) , \tag{71} \]

so that the particle's contribution to the interaction term in (69) is

\[ S_{\rm int\text{-}particle} = -m_0\int d\tau\,\Phi(z(\tau)) . \tag{72} \]

Hence the total action can be written in the following form:

\[ S_{\rm tot} = -m_0c^2\int d\tau\left(1 + \Phi(z(\tau))/c^2\right) - \frac{1}{\kappa c^3}\int d^4x \left(\tfrac12\,\partial_\mu\Phi\,\partial^\mu\Phi - \kappa\Phi T_{\rm r.o.m.}\right) + S_{\rm r.o.m.} . \tag{73} \]

We can now relate this theory to the preceding one. Recall that, by construction,
the field equation for $\Phi$ that follows from (73) is just (65) with $\phi$ replaced by $\Phi$. But
the analogous statement is not true for the equation of motion for the point particle. In
fact, variation of (73) with respect to $z(\tau)$ gives (67), where

\[ \phi = c^2\ln(1 + \Phi/c^2) . \tag{74} \]

Because of (67) it is more natural to call $\phi$ the gravitational potential. For example,
$-\vec\nabla\phi$ is the force on a unit test-mass. Summing up, we may say that a systematic
treatment retains (67) but replaces (65) with the same equation in terms of $\Phi$, whose
relation to $\phi$ is (74). In linear approximation $\phi = \Phi$ and we do get back to the naive
theory.
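The size of the difference between the two potentials can be made concrete with a small numerical sketch. It assumes that (74) reads phi = c^2 ln(1 + Phi/c^2), which is what the mass factor 1 + Phi/c^2 in the particle action implies; the function names and the sample value Phi/c^2 = -2e-6 are illustrative choices, not taken from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def phi_from_Phi(Phi, c=C):
    """Potential phi of eq. (74) computed from the action field Phi."""
    return c**2 * math.log1p(Phi / c**2)


def mass_factor(Phi, c=C):
    """Factor multiplying m_0 in the particle part of the action (73)."""
    return 1.0 + Phi / c**2


# Illustrative weak-field value (an assumption): Phi/c^2 = -2e-6
Phi = -2.0e-6 * C**2
phi = phi_from_Phi(Phi)

# To leading order phi - Phi = -Phi^2 / (2 c^2), so the relative
# difference is -Phi / (2 c^2), i.e. about one part in a million here.
relative_difference = (phi - Phi) / Phi
print(relative_difference)
```

The same factor can be written as exp(phi/c^2), which is the form used when the interaction is read as a position-dependent inertial mass.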

34 Note that $\Phi$ has the physical dimension of a squared velocity, $\kappa$ that of length-over-mass. The prefactor
$1/\kappa c^3$ gives (69) the physical dimension of an action. The overall signs are chosen according to the
general scheme for Lagrangians: kinetic minus potential energy (cf. footnote 33).
Note that the action for the point particle with interaction may be interpreted in
various ways. One is to say that the inertial mass is changed from $m_0$ to $m = m_0 e^{\phi/c^2}$
by the interaction with the gravitational field, thereby becoming spacetime dependent.
This was in fact one of Einstein's concerns: 

"The law of motion of the mass point in a gravitational field had also to 
be adapted to the special theory of relativity. The path was not so unmis- 
takably marked out here, since the inert mass of a body might depend on 
the gravitational potential. In fact, this was to be expected on account of 
the principle of the inertia of energy." (Einstein, 1977 p. 135) 

Another interpretation, later (1914) considered by Einstein and Fokker (CPAE, Vol. 4, 
Doc. 28), is that the particle moves inertially, though not in the Minkowski metric
but in a conformally rescaled metric: $\eta_{\mu\nu} \to g_{\mu\nu} := e^{2\phi/c^2}\eta_{\mu\nu}$. This law, which is
independent of the particle's rest mass, gives a strong hint that "geometrization" is a 
perfect scheme to achieve a universal coupling of gravity to matter. 

The scalar theory outlined so far clearly satisfies the weak equivalence principle, 
according to which all freely falling pointlike test-masses 35 move on the same world 
line for given initial data (spacetime point and four velocity). But this does not imply 
that the acceleration in a gravitational field is independent of the center-of-mass mo- 
tion, such as, e.g., an initial horizontal velocity. To see this, assume the gravitational 
field is static in some frame. Then the particle's equation of motion is equivalent to the 
following 3-vector equation (a dot now signifies a derivative with respect to coordinate 
time t) 

\[ \ddot{\vec z} = -\left(1 - |\dot{\vec z}|^2/c^2\right)\vec\nabla\phi , \tag{75} \]

which is almost like the Newtonian equation, were it not for the additional term in 
parentheses on the right-hand side, which diminishes the vertical acceleration at high 
particle velocities. Although this is a quadratic effect in v/c, Einstein considered this 
to be a very serious failure of the scalar theory of gravitation, which made him abandon 
that track. He wrote: 

"These investigations, however, led to a result which raised my strong 
suspicion. According to classical mechanics, the vertical acceleration of 
a body in the vertical gravitational field is independent of the horizontal 
component of its velocity. Hence in such a gravitational field the vertical 
acceleration of a mechanical system or of its center of gravity comes out 
independently of its internal kinetic energy. But in the theory I advanced, 
the acceleration of a falling body was not independent of its horizontal 
velocity or the internal energy of the system. This did not fit with the old 
experimental fact that all bodies have the same acceleration in a gravita- 
tional field." (Einstein, 1977 pp. 135-136) 
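The velocity dependence Einstein objected to can be checked numerically. The sketch below assumes that (67) has the form z''^mu = -(eta^{mu nu} + z'^mu z'^nu / c^2) d_nu phi and that (75) reads a = -(1 - v^2/c^2) grad phi for a static field; the function names are hypothetical. It evaluates the four-acceleration from (67), converts it to a coordinate-time acceleration, and compares the result with (75).

```python
import math

C = 299_792_458.0  # speed of light in m/s


def coordinate_acceleration(v, grad_phi, c=C):
    """Coordinate-time 3-acceleration implied by eq. (67) for a static potential.

    v        -- 3-velocity (vx, vy, vz) in m/s, with |v| < c
    grad_phi -- spatial gradient of the potential phi in m/s^2
    """
    v2 = sum(x * x for x in v)
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    u = [gamma * c] + [gamma * x for x in v]   # four-velocity dz/dtau
    dphi = [0.0] + list(grad_phi)              # d_nu phi; static field, so d_0 phi = 0
    u_dphi = sum(u[n] * dphi[n] for n in range(4))
    eta = [-1.0, 1.0, 1.0, 1.0]                # diagonal of eta^{mu nu}
    # Four-acceleration from eq. (67)
    A = [-(eta[m] * dphi[m] + u[m] * u_dphi / c**2) for m in range(4)]
    # d/dtau = gamma d/dt gives a^i = (A^i - A^0 v^i / c) / gamma^2
    return [(A[i + 1] - A[0] * v[i] / c) / gamma**2 for i in range(3)]


def acceleration_eq75(v, grad_phi, c=C):
    """Right-hand side of eq. (75)."""
    v2 = sum(x * x for x in v)
    return [-(1.0 - v2 / c**2) * g for g in grad_phi]


# Horizontal motion at 0.6 c in a vertical field of 9.81 m/s^2:
# the vertical acceleration is reduced by the factor 1 - 0.36 = 0.64.
a = coordinate_acceleration((0.6 * C, 0.0, 0.0), (0.0, 0.0, 9.81))
print(a[2])
```

For everyday velocities the correction factor differs from 1 by roughly v^2/c^2, which is why the effect is quadratic in v/c.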

The dependence of the vertical acceleration on the horizontal center-of-mass velocity
is clearly expressed by (75). However, Einstein's additional claim that there is also a
similar dependence on the internal energy does not survive closer scrutiny. One might
think at first that (75) also predicts that, e.g., the gravitational acceleration of a box

35 A 'test-mass' should have vanishing electric charge, vanishing intrinsic angular momentum (spin), and
vanishing higher (than zeroth) multipole moments of its mass distribution.



filled with gas molecules is less when heated up, due to the larger velocities of the gas
molecules. But this argument neglects the walls of the box, which gain in stress due to
the rising gas pressure, and according to (65) more stress means less weight. In fact,
a general argument due to Laue (1911) shows that these effects precisely cancel (for a
detailed discussion, see Norton, 1993).

We want to draw attention to the remarkable closing section of Einstein's part 
of the already mentioned "Entwurf" paper, written jointly with Grossmann, entitled:
"Can gravity be described by a scalar?" (CPAE, Vol. 4, Doc. 13, pp. 321-323). Via an
apparently simple Gedanken experiment Einstein implicitly claims to show that any
scalar theory of gravity, in which the trace of the energy-momentum tensor acts as
source, necessarily violates energy conservation. For the modern field theorist this is a
surprising statement indeed, for any Poincaré invariant theory has a conserved Noether
charge connected with the symmetry of time-translations. This general argument was
not available to Einstein (Noether's seminal paper only appeared in 1918), but it does
show that Einstein's argument cannot be taken at face value. Closer inspection shows
that in theories such as the one described by (73), the conserved energy contains a
contribution in which the local stresses within a body couple to the local gravitational 
potential. It seems that this contribution is not taken into account properly in Einstein's 
argument. More on this will appear elsewhere (Giulini, 2005c). 

In any case, arguments of various kinds seem to have triggered a conceptual phase
transition in Einstein's thinking. He now adopted the strict equivalence principle rather
than Lorentz invariance as his major guiding principle. During his time in Prague, this
led him to consider non-linear modifications of (64), such as the following:

[equation missing in the source] (76)

where $\phi$ is now required to approach the value 1 rather than 0 at large distances from
the source. This equation may be derived from the requirement that the self-energy
of the gravitational field acts as source on a par with the energy density of matter (see
Giulini 1997). However, in Einstein's treatment the field $\phi$ is interpreted as a (spatially)
variable velocity of light. This put him in opposition to contemporaries such as Gunnar
Nordström, Gustav Mie, and Max Abraham who still searched for a special relativistic
theory of gravity (though Abraham's theory also contained a variable speed of light).

5.2 The Poincaré invariant approach

What theory of gravitation would have emerged from the attempts of Abraham, Nordström,
and Mie? What would have happened if Einstein had left physics in, say, 1912?
Would GR never have come into being?

We do not think so, but presumably it would have been discovered much later in
a non-geometrical way that is often called the "flat field approach to gravitation". In
1939 Fierz & Pauli discussed, as an example of their work on higher spin equations,
the field equation for a free massless spin-2 field. These authors were well aware of
the difficulties that arise when a spin-2 field is coupled to matter. After this initial step,
the idea that GR can be formulated as a consistent, highly non-linear spin-2 theory in
flat spacetime was repeatedly studied. The first published work in this direction seems
to be that by Gupta (1954, 1957) and Kraichnan (1955, 1956). Fierz also seems to
have been thinking about this idea early on, which much later led to the thesis work 
of Wyss (1965). Other early attempts were made by Thirring, who was advocating 
this approach with different emphasis in various talks and publications (e.g., Thirring, 
1961). Fortunately, Feynman's Caltech lectures on gravitation, which also emphasize 
the field theoretic approach, have become available in book form (Feynman 1995). 
Weinberg (1964a, 1964b) also tried to develop a quantum theory of a self-interacting 
spin-2 field on flat spacetime. (We now know that such theories are not renormaliz- 
able, and neither are their supersymmetric extensions.) The theme was taken up later 
by Deser (1970), Wald (1996), and others. Quite recently it was shown that one cannot
have several, mutually interacting spin-2 fields (Boulanger et al. 2001). This is
important for string theory, where one identifies gravity with a massless spin-2 field. 

As already discussed, the simplest possibility of gravitational theory in flat space- 
time is that of a scalar field. Since such theories predict no global light deflection, 36 
Einstein urged astronomers in 1913 to measure the light deflection during the solar 
eclipse the following year in the Crimea. Moreover, scalar theories predict a retrogression
of Mercury's perihelion, which in case of the theory described by (73) is 1/6 of
the size of the advance predicted by GR. 37

A spin-1 theory is also not viable. Such a theory would essentially be given by 
Maxwell's equations, with one appropriate sign change in order to make like charges 
(masses) attract rather than repel one another. But this leads to a sign change in the 
expression for the field's energy, which then becomes unbounded from below, giving 
rise to potential instabilities. The perihelion advance it predicts is 1/6 the Einsteinian
value, again in contrast to observation. One is thus led to a spin-2 theory.
If one tries the simplest version by coupling a spin-2 field $h_{\mu\nu}$ linearly to the
energy-momentum tensor of matter, the resulting field equation is unphysical. Since
the free spin-2 theory has a gauge symmetry, the field equation implies that $\partial_\nu T^{\mu\nu} = 0$,
which is unacceptable. For instance, the motion of a fluid would not be affected by the
gravitational field in that case. Clearly, one has to include back-reactions on matter,
which makes the theory non-linear. From the results in the works cited above it follows
that there is only one consistent way of doing this. The gauge group of the linear theory
has to be extended to the full diffeomorphism group, and the field equations become
equivalent to Einstein's equations for a Lorentz metric determined by the spin-2 field.

At this point one can re-interpret the theory geometrically. Thereby the flat metric 
disappears completely and one arrives at GR (cf. Mittelstaedt, 1970, and references 
therein). In summary we can say this: The natural development of the theory shows 
that it is possible to eliminate the flat Minkowski metric, leading to a description in 
terms of a curved metric which has a direct physical meaning. The originally postu- 
lated Lorentz invariance turns out to be physically meaningless and plays no useful 
role. The flat Minkowski spacetime becomes a kind of unobservable ether. The conclusion

36 See Ehlers & Rindler (1997) for a discussion of the difference between local and global light deflection.

37 The same retrogression is also predicted by Nordström's "second" theory (Nordström 1913), whereas
Nordström's "first" theory (Nordström 1912) predicts twice that value (Roseveare 1982, p. 153). Of
course, this only became a problem for these theories after 1915 when GR correctly predicted the
perihelion advance (CPAE, Vol. 6, Doc. 24). The earlier "Entwurf" theory predicted an advance 5/12
the size of the GR value (CPAE, Vol. 4, Doc. 14).
is inevitable that spacetime is a Lorentzian manifold with a metric that is a
dynamical field subject to the Einstein field equations.

6 Einstein's theory of spacetime and gravity 

6.1 General Remarks 

After some detours, which we cannot describe here, Einstein arrived at the final form 
of GR in November 1915. It is a geometric field-theory par excellence. No non- 
dynamical background structures exist, and its equations are invariant under the largest 
group possible: the group of spacetime diffeomorphisms. However, elements of this 
group do not play the role of symmetries, as the Lorentz transformations did in SR, 
but of gauge transformations. As stressed above, this means that any two field con- 
figurations connected by a diffeomorphism are empirically indistinguishable and thus 
physically identical. 

The fundamental field is a Lorentzian (pseudo-Riemannian) metric $g_{\mu\nu}$ on a four-dimensional
manifold $M$, obeying a system of ten non-linear (but quasi-linear) differential
equations ($G$ is again Newton's constant):

\[ G_{\mu\nu} = (8\pi G/c^4)\,T_{\mu\nu} . \tag{77} \]

Here we adopted the signature convention 'mostly plus', i.e. $(-,+,+,+)$, and neglected
a possible cosmological term which we will introduce later. The 'Einstein
tensor', $G_{\mu\nu}$, is a second order differential expression in the metric components $g_{\mu\nu}$
and directly relates to its curvature. 38 $T_{\mu\nu}$ is the stress-energy tensor of matter, which
generally also involves $g_{\mu\nu}$.

Given a solution $g_{\mu\nu}$, a spinless test particle moves on geodesics of that metric,
which is therefore best compared to the gravitational potential. The role of a gravitational
field is then played by the connection $\Gamma^\lambda_{\mu\nu}$, which appears in the geodesic
equation:

\[ \ddot x^\lambda + \Gamma^\lambda_{\mu\nu}\,\dot x^\mu\dot x^\nu = 0 , \tag{78} \]

and which is determined by $g_{\mu\nu}$ and its first derivatives. The conceptual difference
to other 'fields' is that the connection is not a tensor field on spacetime. This can be
seen as a consequence of the equivalence principle, according to which $\Gamma^\lambda_{\mu\nu}$ vanishes
locally in a freely falling frame.

Another difference to other fields is that gravity is not a 'force' in the Newtonian 
sense. In Newtonian physics, a force is the cause for deviations from inertial motion. 
But (78) defines inertial motion and the 'gravitational field', $\Gamma^\lambda_{\mu\nu}$, is a structural
prerequisite for such a definition. Again this is a consequence of unifying inertia and
gravity. 

6.2 Some current theoretical problems of GR 

The Einstein field equations (77) are at the core of GR and much research over the last
50 years has gone into their mathematical analysis. One of the main issues has been 

38 Taking the trace of the Riemann curvature tensor $R^\alpha{}_{\mu\beta\nu}$ in $\alpha\beta$ one gets the Ricci tensor $R_{\mu\nu}$.
Contracting the Ricci tensor with $g^{\mu\nu}$ one obtains the Ricci scalar $R$. The Einstein tensor is now
defined by $G_{\mu\nu} := R_{\mu\nu} - \tfrac12 g_{\mu\nu}R$.
whether the equations admit a well posed initial-value formulation (Cauchy Problem),
as many physical questions are naturally addressed that way. This turned out to be
the case, albeit in a slightly more complicated fashion due to general diffeomorphism
invariance. Roughly speaking, four of the ten components of (77) are mere restrictions
on the initial data, so-called "constraints", and the remaining six components are
evolution equations. This means that given initial data for $g_{\mu\nu}$ which satisfy the constraints,
Einstein's equations leave undetermined the evolution of four out of the ten
components of $g_{\mu\nu}$. However, this does not reflect any lack of physical predictability,
but merely the existence of gauge redundancies corresponding to arbitrary point trans- 
formations, which account for the four arbitrary functions. Such a situation occurs in 
any gauge theory. A pedagogical outline is given, e.g., in (Giulini, 2003). 

One of the most prominent features of Einstein's equations is their non-linearity. 
This means that solutions evolving from regular initial data may develop singularities 
in a finite time. As a result of this, not much is known about the existence of (tempo- 
rally) global solutions. Given a singularity-free solution generated by some initial data 
(i.e., a particular spacetime), it is natural to ask whether sufficiently nearby (in a suit- 
able sense) data still evolve without the formation of singularities. 39 In this case the 
original solution is called stable. Instabilities are well-known from hydrodynamics, 
e.g., those due to the formation of shock waves. One may likewise expect gravitating 
systems to be generically unstable due to gravitational shock-waves and gravitational 
collapse. It may thus be considered a pleasant surprise that stability results have been 
obtained. Most importantly, in a veritable tour de force Christodoulou & Klainerman 
(1993) were able to prove the stability of Minkowski space. Earlier, Friedrich (1986) 
had already proven the stability of De Sitter space (a solution to the matter-free Ein- 
stein equations with positive cosmological constant). A few more, rather scattered 
stability results exist concerning other cosmological models. 

The formation of singularities is, to a certain extent, generic in GR (see, e.g., 
Hawking & Ellis 1973). Singularities might give rise to a true breakdown of pre- 
dictability if the singularity is not causally disconnected from the outside world (i.e., 
from observers not falling into the singularity) by the formation of an event horizon. 
The "cosmic censorship hypothesis" expresses the expectation that under certain rea- 
sonable conditions such a breakdown of predictability does not occur. This hypothesis 
is not yet proven. Part of the problem is that it is difficult to formalize. See the review 
by Clarke (1994) for a precise formulation and an account of what has been achieved so 
far. A lucid and less technical discussion of the fundamental concepts is given by Ear- 
man (1995). The notion of a singularity itself is already far from being straightforward 
(see Geroch, 1968). Often the existence of singularities is demonstrated indirectly 
through reductio ad absurdum arguments. But this does not give any insight into their 
formation and structure. 

Analytical problems concerning the large-scale behavior of gravitational fields are 
currently attracting a lot of interest (see, e.g., Chrusciel & Friedrich, 2004). Specific 
results on black holes will briefly be reviewed in Sec. 6.5.

Other analytical problems with more direct relevance for experiments, such as the 
ongoing search for gravitational waves, concern the motion of compact bodies in the 

39 One says that the evolution has no singularities, if, technically speaking, the maximal Cauchy
development is geodesically complete.
strong-field regime. Here one particularly wishes to understand the phases in which
most of the gravitational radiation is generated. Good candidates for such generation 
processes are the close encounters and mergers of neutron stars and black holes. An- 
alytically, the case of black holes is simpler, for it can be described by the matter-free 
Einstein equations. But it still is a genuine field-theoretic problem, as point objects do 
not exist in GR (a feature that to some extent is mimicked by (76); see Giulini, 2003).
There is no known analytical solution to the two-body problem so that a combina- 
tion of refined analytical approximation schemes and numerical techniques becomes 
essential for evolving initial data. But what are the appropriate initial data for two 
black holes in close proximity, from which we can reliably calculate (numerically) the 
flux of gravitational waves produced in the merging process? There are two problems 
here: First, numerical methods are used to integrate certain field components in the 
near-zone, whereas the mathematical identification of gravitational radiation is done 
in the far zone (on "future null-infinity", should it exist). No unambiguous analytical 
procedure relating the former to the latter (i.e., in the form of a flux-theorem) has been 
given. Strictly speaking, however, this is exactly what is needed to relate the integra- 
tion in the near zone to an actual energy loss of the system. Secondly, standard data for 
two black holes, even the most simple ones describing two non-rotating holes momen- 
tarily at rest, seem to be filled with gravitational radiation up to spatial infinity already 
(Valiente-Kroon, 2003). Hence it seems that one either needs to distinguish between 
radiation already contained in the data and radiation produced by the merger — and it is 
hard to see how that could be done — or to modify data outside the two-hole region, so 
that it no longer contains radiation. It was only understood recently (Corvino, 2000) 
that initial data can, in fact, be modified locally, which is not obvious. 40 But so far this 
existence result has not been backed up by sufficiently concrete methods that could be 
employed to change initial data in a physically controllable way. 

For other aspects of current activities in mathematical relativity, see (Frauendiener
et al., 2006).

6.3 Some aspects of the current experimental situation 

During the last few years we have seen tremendous developments in experimental and 
observational gravity. These range from weak-field tests using planets or satellites 
in earthbound or solar-system orbits, to very impressive strong-field tests on Galactic 
binary-systems with compact objects, like neutron stars. We shall have more to say 
about the latter systems in Sec. 6.5. Here we make some comments on the classic
weak-field tests. 

In Sec. 4.4 we roughly described how qualitative statements about a range of aspects
of a theory can be made by parameterizing possible deviations and extracting
upper bounds for their values from observations. Various methods to do this in GR 
have been developed to a high degree of sophistication (see, e.g., Will 1993). One of 
them is the so-called "Parameterized Post-Newtonian" (PPN) formalism, where one 
considers finite-parameter families of metrics, $g_{\mu\nu}$, all of which are within a post-Newtonian
approximation scheme. Each of the parameters may be thought of as probing

40 It is not obvious because the data have to keep satisfying the constraints, which form an underdetermined
elliptic system of differential equations. Note that solutions to strictly elliptic systems generally
cannot be modified locally, since they are uniquely determined by the boundary conditions.
a specific deviation from the prediction made by GR. Some correspond to so-called
"preferred-frame" and "preferred-location" effects, others parameterize possible violations
of conservation of total momentum. If we discard all those, only two parameters
$\beta$ and $\gamma$ remain (the so-called "Eddington-Robertson parameters").
Consider the case of a static, spherically symmetric metric: 

\[ ds^2 = g_{00}(r)\,c^2dt^2 + g_{ab}(r)\,dx^a dx^b . \tag{79} \]

Its parameterized form contains $\gamma$ and $\beta$, which measure spatial curvature and "non-linearity",
41 respectively:

\[ \begin{aligned}
g_{00}(r) &= -\left[1 - 2(m/r) + 2\beta(m/r)^2 + O([m/r]^3)\right] , \\
g_{ab}(r) &= \left[1 + 2\gamma(m/r) + O([m/r]^2)\right]\delta_{ab} .
\end{aligned} \tag{80} \]

Here $m = GM/c^2$, where $M$ is the central mass, $G$ is Newton's constant, and $c$ is
the velocity of light. The physical dimension of $m$ is that of length. In GR (without
cosmological constant), the unique static and spherically symmetric solution is the
Schwarzschild solution, which corresponds to the values $\gamma = \beta = 1$.

Typical experiments testing the value of $\gamma$ involve the deflection of light's direction
of travel. For (80) the deflection angle comes out to be

\[ \Delta\theta = \tfrac12(1+\gamma)\cdot\underbrace{\frac{4m}{d}}_{\text{GR value}} , \tag{81} \]

where $d$ is the impact parameter (distance of closest approach). Using Very Long
Baseline Interferometry (VLBI), one finds that deflections of various astronomical
sources lead to the upper bound $|\gamma - 1| < 2\cdot10^{-4}$.
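To get a feeling for the numbers, the deflection formula can be evaluated for a light ray grazing the Sun. This is a minimal sketch assuming (81) reads Delta theta = (1/2)(1 + gamma) 4m/d; the solar constants (GM of the Sun and the nominal solar radius) and the function name are supplied here for illustration and do not come from the text.

```python
import math

GM_SUN = 1.32712440018e20  # heliocentric gravitational constant in m^3/s^2 (assumed value)
C = 299_792_458.0          # speed of light in m/s
R_SUN = 6.957e8            # nominal solar radius in m (assumed value)
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def deflection_angle(gamma, m, d):
    """Light deflection angle in radians according to eq. (81)."""
    return 0.5 * (1.0 + gamma) * 4.0 * m / d


m_sun = GM_SUN / C**2  # gravitational length of the Sun, about 1.5 km

# GR (gamma = 1) for a ray grazing the solar limb: about 1.75 arcseconds
gr_arcsec = deflection_angle(1.0, m_sun, R_SUN) * RAD_TO_ARCSEC

# gamma = 0 (no spatial curvature) gives exactly half the GR value
half_arcsec = deflection_angle(0.0, m_sun, R_SUN) * RAD_TO_ARCSEC
print(gr_arcsec, half_arcsec)
```

Setting gamma = -1 makes the deflection vanish, which is how the absence of global light deflection in scalar theories appears in this parameterization.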

More accurate tests involve the delay in integrated time of propagation, when a 
signal travels between two sites at distances r\ and r 2 from the central body, with 
closest approach d to the body (the so-called Shapiro time delay). For a round trip the 
delay is given by: 

AT = |(1 + 7) • — ■ ln(4nr 2 /d 2 ) . (82) 



GR value 

What is directly observed is not individual delay times, but their variation in
observation-time t, as the line of sight (and hence d) comes closer to the central body
(the Sun). The best data currently available were obtained with the Cassini
spacecraft on its flight to Saturn in June-July 2002. Stable and coherent two-way radio
signals were exchanged and fractional frequency shifts were observed. This led to the
upper bound (Bertotti et al., 2003):

$$|\gamma - 1| < 4.4 \cdot 10^{-5} . \qquad (83)$$
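To give a feeling for the size of the round-trip delay (82), here is a minimal sketch for a signal passing close to the Sun. The geometry (one end at 1 AU, the other at 8.4 AU, closest approach 1.6 solar radii) is an illustrative assumption, not data quoted in the text, and `shapiro_round_trip_delay` is a hypothetical helper name.

```python
import math

# Rounded constants (assumed for illustration)
G = 6.674e-11      # m^3 kg^-1 s^-2
C = 2.998e8        # m/s
M_SUN = 1.989e30   # kg
R_SUN = 6.957e8    # m
AU = 1.496e11      # m


def shapiro_round_trip_delay(mass_kg, r1_m, r2_m, d_m, gamma=1.0):
    """Round-trip delay in seconds from (82): (1/2)(1 + gamma) * (4m/c) * ln(4 r1 r2 / d^2)."""
    m = G * mass_kg / C**2
    return 0.5 * (1.0 + gamma) * (4.0 * m / C) * math.log(4.0 * r1_m * r2_m / d_m**2)


# Illustrative geometry only: these distances are assumptions, not measured values
delay_s = shapiro_round_trip_delay(M_SUN, 1.0 * AU, 8.4 * AU, 1.6 * R_SUN)
delay_gamma0_s = shapiro_round_trip_delay(M_SUN, 1.0 * AU, 8.4 * AU, 1.6 * R_SUN, gamma=0.0)
print(f"round-trip delay: {delay_s * 1e6:.0f} microseconds")
```

Under these assumptions the delay is a few hundred microseconds. Because the delay scales with (1 + gamma)/2, a bound on gamma - 1 at the level of (83) corresponds to a fractional accuracy of about half that value in the measured delay.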

41 Being "non linear" depends on the coordinates used, i.e., is a gauge dependent statement.

A typical observable effect which is sensitive to the value of $\beta$ is the "anomalous"
advance, or shift, of the periastron. The advance $\Delta\varphi$ per revolution, calculated for the
metric (80), is given by

$$\Delta\varphi = \tfrac{1}{3}(2\gamma - \beta + 2) \cdot \underbrace{\frac{6\pi m}{a(1-e^2)}}_{\text{GR value}} , \qquad (84)$$

where a is the radius of the semi-major axis and e is the eccentricity. Applied to solar
system planets (in which case one speaks of the perihelion), this shift is most
pronounced for the innermost planet, Mercury, and quite accurately known (corresponding
to a localization of Mercury up to 300 meters using radar reflection techniques).
In order to compare observations with (84) one has to take into account other effects
contributing to the perihelion shift. Those originating from perturbations by other
planets were already known in the 19th century. It was Einstein's very first triumph with
GR to show that it accounts precisely for the discrepancy (of about 8 percent) between
the observed shift and the shift due to planetary perturbations.
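The GR value in (84) can be checked numerically for Mercury. The sketch below is only an illustration: the orbital elements and constants are rounded textbook values assumed here, and `periastron_advance` is a hypothetical helper name.

```python
import math

# Rounded constants and orbital elements of Mercury (assumed for illustration)
G = 6.674e-11            # m^3 kg^-1 s^-2
C = 2.998e8              # m/s
M_SUN = 1.989e30         # kg
A_MERCURY = 5.791e10     # semi-major axis, m
E_MERCURY = 0.2056       # eccentricity
PERIOD_DAYS = 87.969     # orbital period, days


def periastron_advance(mass_kg, a_m, ecc, gamma=1.0, beta=1.0):
    """Advance per revolution in radians from (84)."""
    m = G * mass_kg / C**2
    gr_value = 6.0 * math.pi * m / (a_m * (1.0 - ecc**2))
    return (2.0 * gamma - beta + 2.0) / 3.0 * gr_value


per_orbit_rad = periastron_advance(M_SUN, A_MERCURY, E_MERCURY)
orbits_per_century = 36525.0 / PERIOD_DAYS
arcsec_per_century = math.degrees(per_orbit_rad * orbits_per_century) * 3600.0
print(f"relativistic perihelion advance: {arcsec_per_century:.1f} arcsec per century")
```

With these inputs the advance is close to 43 seconds of arc per century, the famous residual that remains after planetary perturbations are subtracted.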

However, Einstein did not take into account a possible contribution from the Sun's
quadrupole moment. Such a contribution would be significant if the quadrupole
moment, which is measured by a dimensionless number $J_2$, were at the upper end of the
interval considered plausible for our Sun, which is roughly the interval
$10^{-7} < J_2 < 10^{-5}$. Ironically, this stirred up considerable controversy during the 1960s and 70s,
with one side arguing that the motion of Mercury's perihelion refuted rather than
confirmed GR! Essentially the problem in the traditional approach is that estimations of
$J_2$ are made on the basis of the relation between the Sun's oblateness and its surface
angular velocity, a procedure that depends on one's model of the Sun. In particular, it
involves assumptions about its interior state of differential rotation. Modern results all
suggest a 'small' value of $J_2$ of about $2 \cdot 10^{-7}$ (e.g., Lydon & Sofia, 1996). This is
confirmed by new methods that use normal modes of solar oscillations (helioseismology)
in order to get information about the internal structure of the Sun (e.g., Roxburgh,
2001). See also Pireaux et al. (2003) for a comprehensive discussion and many
references.

A small value for $J_2$ implies that the contribution of the Sun's quadrupole moment
to the perihelion shift is less than $10^{-3}$ times the relativistic effect. Combining this
value with the results of radar observations on Mercury and with the upper bound (83)
on $|\gamma - 1|$, one finds the following upper bound for the parameter $\beta$:

$$|\beta - 1| < 3 \cdot 10^{-3} . \qquad (85)$$

Note that if the orbiting mass is not a test body, but comparable in mass to the central
body, (84) still applies as long as m is now taken to be the sum, $m_1 + m_2$, of the two
masses. In this form (84) is applied to binary-pulsar systems.
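As a hedged illustration of this use of (84), the sketch below inserts the total mass of a binary and obtains the semi-major axis of the relative orbit from Kepler's third law. The numbers are rounded, assumed values resembling the Hulse-Taylor system (total mass about 2.83 solar masses, orbital period about 27907 s, eccentricity about 0.617); they are not quoted in this section, and the helper names are hypothetical.

```python
import math

# Rounded constants (assumed for illustration)
G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m/s
M_SUN = 1.989e30     # kg
YEAR_S = 365.25 * 86400.0

# Assumed binary parameters, roughly those of the Hulse-Taylor system
TOTAL_MASS = 2.828 * M_SUN   # m1 + m2, kg
PERIOD_S = 27906.98          # orbital period, s
ECC = 0.6171                 # orbital eccentricity


def relative_semi_major_axis(total_mass_kg, period_s):
    """Kepler's third law for the relative orbit: a^3 = G (m1 + m2) P^2 / (4 pi^2)."""
    return (G * total_mass_kg * period_s**2 / (4.0 * math.pi**2)) ** (1.0 / 3.0)


def gr_advance_per_orbit(total_mass_kg, a_m, ecc):
    """GR value of (84) with m = G (m1 + m2) / c^2, in radians per revolution."""
    m = G * total_mass_kg / C**2
    return 6.0 * math.pi * m / (a_m * (1.0 - ecc**2))


a_rel = relative_semi_major_axis(TOTAL_MASS, PERIOD_S)
per_orbit = gr_advance_per_orbit(TOTAL_MASS, a_rel, ECC)
deg_per_year = math.degrees(per_orbit * YEAR_S / PERIOD_S)
print(f"periastron advance: {deg_per_year:.2f} degrees per year")
```

With these assumed inputs the advance is a little over four degrees per year, several tens of thousands of times larger than Mercury's relativistic perihelion shift.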

Other modern tests are sensitive to the intrinsic angular momentum (spin) of the
central body. In that case the static metric (79) needs to be generalized to a stationary
one (for constant angular momentum), which differs from (79), (80) by an off-diagonal
contribution. In the simplified version of the PPN formalism displayed here, this
contribution takes the form

$$g_{0a}(r) = -\tfrac{1}{2}(1+\gamma) \cdot \frac{2(\vec{s} \times \vec{x})_a}{r^3} , \qquad (86)$$

where $\vec{s} = \vec{S}G/c^3$, $\vec{S}$ being the (constant) spin vector. The physical dimension of $\vec{s}$ is
that of length-squared. It is interesting to note that at this level of approximation and
under the simplifications assumed here, no new PPN parameter enters. 43

42 To these perturbations, Venus, Jupiter, and Earth make the largest contributions with 52, 29, and 17
percent respectively.

A technologically very audacious experiment recently completed and currently
being analysed is Gravity-Probe B. It consists of an Earth-orbiting satellite in a polar 44 orbit
approximately 640 km above ground. The satellite contains four electrostatically
suspended gyroscopes. According to GR, the Earth's rotation should induce a precession
of these gyroscopes (with respect to a quasar background).

The metric components (86) cause local inertial frames (here realized by drag-free
suspended gyroscopes) to rotate with respect to asymptotic frames at large radii (here
realized by the quasar background) at an angular frequency of (to leading order):

$$\vec{\Omega}_{\rm gyro}(\vec{x}) = \tfrac{1}{2}(\gamma+1) \cdot \underbrace{c \cdot \frac{3\vec{n}(\vec{n}\cdot\vec{s}) - \vec{s}}{r^3}}_{\text{GR value}} , \qquad (87)$$

where $\vec{n} = \vec{x}/r$. This is called the Lense-Thirring precession. In the case of Gravity
Probe B, the predicted precession is about $10^{-10}\,\omega$, where $\omega$ is the Earth's angular
velocity. This amounts to a minuscule precession of 47 milli-arcseconds per year! The
Gravity-Probe-B experiment is expected to verify this prediction of GR at the
one-percent level. 45 Its conceptual importance is that it directly measures the gravitational
effects caused by mass-currents, sometimes referred to as "dragging effects". That
part of the gravitational field which is generated by mass-currents is often called the
"gravitomagnetic field". It is needed for theoretical consistency, just as the magnetic
field is needed in electrodynamics (for a lucid discussion, see Nordtvedt, 1988).
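The order of magnitude of (87) for a Gravity-Probe-B-like orbit can be estimated with a short orbit average. This is only a sketch under stated assumptions: the Earth's spin angular momentum (about 5.86e33 kg m^2/s), its mean radius, and a circular polar orbit at 640 km altitude are rounded values assumed here, and the function names are hypothetical.

```python
import math

# Rounded constants and Earth data (assumed for illustration)
G = 6.674e-11        # m^3 kg^-1 s^-2
C = 2.998e8          # m/s
J_EARTH = 5.86e33    # spin angular momentum of the Earth, kg m^2/s (assumed)
R_EARTH = 6.371e6    # mean Earth radius, m
ALTITUDE = 6.40e5    # orbital altitude, m
OMEGA_EARTH = 7.292e-5   # Earth's angular velocity, rad/s
YEAR_S = 365.25 * 86400.0


def lense_thirring_rate(n, s_vec, r, gamma=1.0):
    """Angular velocity vector (rad/s) from (87) at position r * n, with n a unit vector."""
    n_dot_s = sum(ni * si for ni, si in zip(n, s_vec))
    prefactor = 0.5 * (gamma + 1.0) * C / r**3
    return tuple(prefactor * (3.0 * ni * n_dot_s - si) for ni, si in zip(n, s_vec))


def polar_orbit_average(s_vec, r, steps=3600):
    """Average (87) over a circular polar orbit lying in the x-z plane (spin along z)."""
    total = [0.0, 0.0, 0.0]
    for k in range(steps):
        theta = 2.0 * math.pi * k / steps
        n = (math.sin(theta), 0.0, math.cos(theta))
        rate = lense_thirring_rate(n, s_vec, r)
        for i in range(3):
            total[i] += rate[i] / steps
    return tuple(total)


s_len = G * J_EARTH / C**3                 # s = S G / c^3, dimension length^2
avg = polar_orbit_average((0.0, 0.0, s_len), R_EARTH + ALTITUDE)
mas_per_year = math.degrees(avg[2] * YEAR_S) * 3600.0 * 1000.0
ratio_to_earth_rotation = avg[2] / OMEGA_EARTH
print(f"{mas_per_year:.0f} milli-arcseconds per year, ratio {ratio_to_earth_rotation:.1e}")
```

Under these assumptions the orbit-averaged rate is a few tens of milli-arcseconds per year, about 1e-10 of the Earth's angular velocity, which is the scale quoted above. The exact figure depends on the assumed angular momentum and orbit.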

Another dragging effect of a spinning central body is the precession of orbital
planes around it. This has been recently verified for the Earth-orbiting system of two
LAGEOS satellites (designed for other purposes), though only at the 10% level
(Ciufolini & Pavlis, 2004).

More accurate though indirect measurements of dragging effects exist for neutron- 
star binary systems. These are due to spin-orbit and (much more pronounced) orbit- 
orbit couplings. According to GR, a spinning companion gives a contribution to the 
periastron shift of 

$$\vec{\Omega}_{\rm periastron}(\vec{x}) = -c \cdot \frac{3\vec{n}(\vec{n}\cdot\vec{s}) - \vec{s}}{a^3(1-e^2)^{3/2}} . \qquad (88)$$

Here a is the semi-major axis and e is the orbital eccentricity. This means that if spin
and orbital angular momentum form an acute angle, the periastron shifts due to (88)
and due to (84) (where m now corresponds to the sum of masses) will be in opposite
directions. For binary pulsars spin-orbit effects have been calculated by Damour &
Schäfer (1988).

43 This is because we excluded preferred-frame and preferred-location effects, as well as violations of
total momentum conservation. Then the "electric" and "magnetic" parts of the linearized gravitational
field are related by local Lorentz invariance. This is just as in electrodynamics, where the magnetic
field produced by a moving charge can be obtained from the Coulomb field of a charge at rest by a
Lorentz transformation. In particular, as in electrodynamics, the split between 'electric' and 'magnetic'
parts of the gravitational field is observer-dependent.

44 The orbit is chosen polar in order to avoid unwanted contributions to the measured effect from the
Earth's quadrupole moment.

45 Another prediction of GR is the geodetic (or de Sitter) precession, which is a consequence of spatial
curvature and hence directly sensitive to $\gamma$. In the present situation it is about 160 times larger than
the Lense-Thirring precession. If completed successfully, Gravity-Probe-B should therefore measure
$\gamma$ with an accuracy of $3 \cdot 10^{-5}$, thereby slightly improving on the accuracy reached by Cassini.

In a binary system with comparable masses, the two components also move with
comparable velocities in the center-of-mass frame. In that case the gravitomagnetic fields
of both components contribute to their mutual periastron advance (orbit-orbit
coupling). This effect is generally much bigger than that due to spin. For example, the
Hulse-Taylor pulsar shows a total periastron advance of 4.2° per year, which is the sum
of about 10° per year from the gravitoelectric field and about −6° per year resulting from
dragging due to the orbital motion of each companion in the center-of-mass frame (see
Nordtvedt, 1988).

6.4 Early history of gauge and Kaluza-Klein theories 

The history of gauge theories begins with GR, which can be regarded as a non-Abelian 
gauge theory of a special type. To a large extent the other gauge theories gradually 
emerged, in a slow and complicated process, from GR. Their common geometrical 
structure — best expressed in terms of connections of fiber bundles — is now widely 
recognized. 

Weyl's papers on the gauge principle 

It all began with H. Weyl (1918), who made the first attempt to extend GR in order to
describe gravitation and electromagnetism within a unifying geometrical framework.
This brilliant proposal contains the germ of all mathematical aspects of non-Abelian
gauge theory. The term 'gauge' (German: 'Eich') transformation appeared for the first
time in a subsequent paper on this theory (Weyl 1919, p. 114; cf. CPAE, Vol. 8, Doc. 661,
note 5), but in the everyday meaning of change of length or change of calibration.

Einstein admired Weyl's theory as "a coup of genius of the first rate" (CPAE, Vol. 8,
Doc. 498), but immediately realized that it was physically untenable. After a long
discussion Weyl finally admitted that his attempt was a failure as a physical theory (for
discussion, see Straumann, 1987). It paved the way, however, for the correct
understanding of gauge invariance. After the advent of quantum theory, Weyl himself
reinterpreted his original theory in a magisterial paper (Weyl 1929). This reinterpretation
had actually been suggested before by London (1927). Fock (1926), Klein (1926),
and others arrived at the principle of gauge invariance in the framework of wave
mechanics along completely different lines. 46 It was Weyl, however, who emphasized the
role of gauge invariance as a constructive principle from which electromagnetism can
be derived. This point of view became very fruitful for our present understanding of
fundamental interactions.

46 For details see the survey by Jackson and Okun, 2001, which also discusses the 19th-century roots of
gauge invariance.

Weyl's papers have repeatedly been discussed in detail (see O'Raifeartaigh &
Straumann, 2000). Weyl's reinterpretation was connected to his incorporation of
Dirac's theory into GR, an important contribution in and of itself. This in turn was
related to Einstein's then-recent unified theory, which invoked a distant parallelism with
torsion. Wigner (1929) and others had noticed a connection between this theory and
the spin theory of the electron. Weyl did not care for this and wanted to dispense with
teleparallelism. This he achieved with the help of local tetrads (Vierbeine), a
technique that had been used extensively before by Cartan. Preparing the ground with a
general-relativistic formulation of spinor theory, Weyl begins the final section of his
1929 paper with:

"We come now to the critical part of the theory. In my opinion the origin 
and necessity for the electromagnetic fields is the following. The
components $\psi_1$, $\psi_2$ [of the two-component spinor field] are, in fact, not uniquely
determined by the tetrad but only to the extent that they can still be
multiplied by an arbitrary "gauge factor" $e^{i\lambda}$. The transformation of the $\psi$
induced by a rotation of the tetrad is determined only up to such a factor. In
SR one must regard this gauge factor as a constant because we have only
a single point-independent tetrad. Not so in GR; every point has its own
tetrad and hence its own arbitrary gauge-factor; because by the removal
of the rigid connection between tetrads at different points the gauge-factor
becomes an arbitrary function of position." (Weyl, 1968, Vol. III, Doc. 85,
p. 263)

In this way Weyl arrived at the gauge principle in its modern form. As he emphasized:
"from the arbitrariness of the gauge factor in $\psi$ appears the necessity to introduce the
electromagnetic potential" (Weyl, 1968, Vol. III, Doc. 85, p. 263).
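The step from a position-dependent gauge factor to a potential can be made explicit in a few lines. The following is a minimal sketch in modern notation, not Weyl's own presentation; the gauge factor is written as $e^{i\lambda(x)}$, and the conventions (charge and $\hbar$ absorbed into $A_\mu$, the sign in $D_\mu$) are assumptions chosen for illustration.

```latex
\begin{align}
\psi(x) &\to \psi'(x) = e^{i\lambda(x)}\,\psi(x), \\
\partial_\mu\psi' &= e^{i\lambda}\bigl(\partial_\mu\psi + i(\partial_\mu\lambda)\,\psi\bigr), \\
D_\mu &:= \partial_\mu - iA_\mu, \qquad A_\mu \to A'_\mu = A_\mu + \partial_\mu\lambda, \\
D'_\mu\psi' &= e^{i\lambda}\bigl(\partial_\mu\psi + i(\partial_\mu\lambda)\psi
   - iA_\mu\psi - i(\partial_\mu\lambda)\psi\bigr) = e^{i\lambda}\,D_\mu\psi .
\end{align}
```

The ordinary derivative of the transformed field contains the unwanted term proportional to $\partial_\mu\lambda$. It cancels only if a field $A_\mu$ is present that shifts by the gradient of $\lambda$, which is the transformation behaviour of the electromagnetic potential.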

The early work of Kaluza and Klein 

Early in 1919 Einstein received a paper by Theodor Kaluza, a young mathematician
(Privatdozent) and consummate linguist in Königsberg. Inspired by the work of Weyl
the year before, Kaluza proposed another geometrical unification of gravitation and 
electromagnetism by extending spacetime to a five-dimensional pseudo-Riemannian 
manifold. Einstein reacted very positively. On April 21, 1919 he wrote to Kaluza: 
"The idea of achieving [a unified theory] by means of a five-dimensional cylinder 
world never dawned on me (...). At first glance I like your idea enormously" (CPAE, 
Vol. 9, Doc. 26). A few weeks later he added: "the formal unity of your theory is 
startling" (CPAE, Vol. 9, Doc. 35). The fourth of the five letters of Einstein to Kaluza 
(CPAE, Vol. 9, Doc. 40), however, makes it understandable why Einstein, despite his 
initial enthusiasm, delayed the publication of Kaluza's work for almost two years. In 
this letter Einstein raised a serious objection. What worried Einstein was the appar- 
ently huge influence of the scalar field on the electron in the dimensional reduction 
of the five-dimensional geodesic equation. Einstein expressed his hope that Kaluza 
would find a way out. But Einstein's "serious difficulty" ("ernsthafte Schwierigkeit") 
remained, as Kaluza (1921) acknowledged in his published paper. 

A few years later, shortly after the discovery of the Schrödinger equation, Oskar
Klein improved and extended Kaluza's treatment, and revealed an interesting
geometrical interpretation of gauge transformations (Klein 1926a, 1926b). Applying the
formalism of quantum mechanics to the five-dimensional geodesic, and assuming
periodicity in the extra dimension, he also suggested that "the atomicity of the electric
charge may be interpreted as a quantum law" (Klein 1926b, p. 516). The extension of
the extra dimension turned out to be comparable to the Planck length. As Klein writes:

"The small value of this length together with the periodicity of the fifth
dimension may perhaps be taken as a support of the theory of Kaluza in the
sense that they may explain the non-appearance of the fifth dimension in
ordinary experiments as the result of averaging over the fifth dimension."
(Klein, 1926b, p. 516)

For further discussion of this early work on higher-dimensional unification, see, e.g., 
O'Raifeartaigh & Straumann (2000). 

GR also played a crucial role in Pauli's discovery of non-Abelian gauge theories
(see Pauli's letters to Pais and Yang in Pauli 1985-99, Vol. 4). He arrived at all basic
equations through dimensional reduction of a generalization of Kaluza-Klein theory, in
which the internal space becomes a two-sphere. (For a description in modern language,
see O'Raifeartaigh and Straumann 2000.)

In contrast, in the work of Yang and Mills (1954) GR played no role. In an inter- 
view in 1991 Yang recalled: 

"It happened that one semester [around 1970] I was teaching GR, and I 
noticed that the formula in gauge theory for the field strength and the for- 
mula in Riemannian geometry for the Riemann tensor are not just similar 
- they are, in fact, the same if one makes the right identification of sym- 
bols! It is hard to describe the thrill I felt at understanding this point." 
(Zhang, 1993, p. 17) 

The developments after 1958 consisted in the gradual recognition that — contrary to 
phenomenological appearances — Yang-Mills gauge theory could describe weak and 
strong interactions. Since this history is recounted in numerous textbooks, there is no 
need for us to dwell on it. 

6.5 Relativistic astrophysics 

By 1915 it was known through the work of W. Adams on the binary system of
Sirius that Sirius B has an enormous average density of about $10^6$ g/cm$^3$. The existence
of such compact stars constituted one of the major puzzles of astrophysics until the
quantum statistical theory of the electron gas was worked out. On August 26, 1926,
a paper by Dirac (1926) containing the Fermi-Dirac distribution was communicated
to the Royal Society by R.H. Fowler. On November 3 of the same year, Fowler
presented his own work to the Royal Society (Fowler, 1926a), in which he systematically
worked out the quantum statistics of identical particles and, in the process, developed
the well-known Darwin-Fowler method. Shortly thereafter, on December 10, Fowler
(1926b) communicated to the Royal Astronomical Society a new paper with the title
"Dense Matter". In this work he showed that the electron gas in Sirius B is almost
completely degenerate in the sense of the new Fermi-Dirac statistics, realizing that
"the black-dwarf is best likened to a single gigantic molecule in its lowest quantum
state" (Fowler 1926b, p. 122), and he developed the non-relativistic theory of white
dwarfs. The Fowler theory of white dwarfs is equivalent to the Thomas-Fermi theory,
in which a white dwarf is considered as a big "atom" with about $10^{57}$ electrons. For
white dwarfs the (semi-classical) Thomas-Fermi approximation is perfectly justified.

It is remarkable that the quantum statistics of identical particles, satisfying Pauli's 
exclusion principle, found their first application in astrophysics. We recall that this 
principle implies that a sufficiently dense gas of such particles builds up a "zero-point" 
or "Fermi" pressure, depending only on its density and not its temperature. If the 
Fermi pressure dominates the pressure of the gas one calls it "degenerate". In the 
"non-relativistic" regime, where the kinetic energy of each particle is proportional to 
the square of its momentum, the Fermi pressure is proportional to the density of the 
gas raised to the power of 5/3. However, in the "ultrarelativistic" regime, the kinetic
energy becomes directly proportional to the modulus of the momentum, as seen from
(34). As a result, the Fermi pressure becomes proportional to the density raised to the
power of only 4/3, a distinctly slower increase. In this sense SR has a destabilizing
effect, which leads to a finite limiting mass for white dwarfs, given roughly by the
ratio $M_{\rm Pl}^3/m_N^2$. (Here $M_{\rm Pl}$ denotes the Planck mass and $m_N$ the nucleon mass.) The
existence of such a limiting mass is thus an immediate consequence of SR and the
Pauli principle. All this was recognized independently by several people (I. Frenkel,
E. Stoner, S. Chandrasekhar and L.D. Landau) soon after the initial step was taken by
Fowler.
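The mass scale set by the Planck mass and the nucleon mass is easy to evaluate. The sketch below is an order-of-magnitude illustration only: it computes the bare ratio of the Planck mass cubed to the nucleon mass squared, with rounded constants assumed here, and leaves out the numerical factors and composition dependence of the actual limiting mass.

```python
import math

# Rounded constants (assumed for illustration)
HBAR = 1.0546e-34        # reduced Planck constant, J s
G = 6.674e-11            # m^3 kg^-1 s^-2
C = 2.998e8              # m/s
M_NUCLEON = 1.6726e-27   # nucleon (proton) mass, kg
M_SUN = 1.989e30         # kg


def planck_mass():
    """Planck mass sqrt(hbar c / G) in kg."""
    return math.sqrt(HBAR * C / G)


def limiting_mass_scale():
    """Bare scale M_Pl^3 / m_N^2 in kg, without numerical prefactors."""
    return planck_mass() ** 3 / M_NUCLEON**2


scale_in_solar_masses = limiting_mass_scale() / M_SUN
print(f"M_Pl^3 / m_N^2 is about {scale_in_solar_masses:.2f} solar masses")
```

The ratio comes out at roughly two solar masses, so a combination of fundamental constants alone already fixes the limiting mass to be of stellar size.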

In 1934, Chandrasekhar derived the exact relation between mass and radius for 
completely degenerate configurations. He concluded his paper with the following 
statement: 

"The life-history of a star of small mass must be essentially different from 
the life-history of a star of large mass. For a star of small mass, the nat- 
ural white-dwarf stage is an initial step towards complete extinction. A 
star of large mass cannot pass into the white-dwarf stage and one is left 
speculating on other possibilities." (Chandrasekhar, 1934, p. 77) 

The delayed acceptance of the discovery by the 19-year-old Chandrasekhar that
quantum theory plus SR imply the existence of a limiting mass for white dwarfs is one
of the more bizarre stories of the history of astrophysics. The following reaction of
Landau is particularly astonishing:

"For $M > 1.5\,M_\odot$ there exists in the whole quantum theory no cause
preventing the system from collapsing to a point. As in reality such masses
exist quietly as stars and do not show any such ridiculous tendencies, we
must conclude that all stars heavier than $1.5\,M_\odot$ certainly possess regions
in which the laws of quantum mechanics (and therefore of quantum
statistics) are violated." (Israel, 1987, p. 215)

Still reeling from the quantum revolution a few years earlier, some physicists already 
expected a new revolution in the domain of relativistic quantum theory. 

It is worth mentioning that Lieb and Yau (1987) have shown that Chandrasekhar's 
theory can be obtained as a limit of a quantum-mechanical description in terms of a 
semi-relativistic Hamiltonian. 

47 The paper by Thomas (1926) was presented at the Cambridge Philosophical Society on November 6, 
1926. (Fermi's work was independent, but about one year later.) Fowler communicated his important 
paper on the non-relativistic theory of white dwarfs about one month later. One wonders who first 
noticed the close connection of the two approaches. 
Soon after the discovery of the neutron, Baade and Zwicky, in a remarkable pair of
papers (Baade & Zwicky, 1934a, 1934b), developed the idea of a neutron star and made 
the prescient suggestion that such stars would be formed in supernova explosions: 

"With all reserve we advance the view that supernovae represent the tran- 
sitions of an ordinary star into a neutron star, consisting mainly of neu- 
trons. Such a star may possess a very small radius and an extremely high 
density." (Baade and Zwicky, 1934b, p. 263) 

The first calculations for models of neutron stars in GR were performed by
Oppenheimer & Volkoff (1939). In their pioneering work, they used the equation of state
of a completely degenerate ideal neutron gas. In those early days the effects of strong 
interactions could not be estimated. Theoretical interest in neutron stars soon dwin- 
dled, since no relevant observations existed. For two decades, Zwicky was one of 
the few who took seriously the probable role of neutron stars as final states of mas- 
sive stars. Interest in the subject was reawakened in the late 1950s and early 1960s. 
When pulsars were discovered in 1967, especially when a pulsar with a short period of 
0.033 s was found in the Crab Nebula, it became clear that neutron stars can be formed 
in type II supernova events through the collapse of the stellar core to nuclear densities. 
Since then the physics and astronomy of neutron stars has become one of the major 
fields of relativistic astrophysics. 

Systems in close proximity containing two neutron stars (binary and double pul- 
sars) have led to the most remarkable tests of GR. One of them is the celebrated Hulse- 
Taylor pulsar PSR 1913+16 that we have already mentioned, which gave rise to the first 
indirect evidence of gravitational waves. The measured long-term decrease of its
orbital period agrees perfectly with the energy loss due to the radiation of gravitational
waves predicted by GR (see Fig. 1).

Another very interesting system is J0737-3039, which in October 2003 was shown 
to consist of two pulsars with pulse periods of about 23 milliseconds and 2.7 seconds, 
respectively, and an orbital period of 2.4 hours, and with an extremely high periastron 
advance of almost 17 degrees per year. This is about four times larger than that of the 
Hulse-Taylor pulsar. Fortuitously, the system's orbital orientation relative to our line 
of sight is almost exactly edge-on, which means that measurements of Shapiro time- 
delay of pulse periods of one component in the gravitational potential of the other can 
be performed with high precision. Presently the measurements of Shapiro delay verify 
GR predictions at the 0.1% level (Kramer et al., 2005). More results on this exciting
system are expected in the near future. 

The theory of black holes is among the most beautiful applications of GR. The
structure of stationary black holes was completely clarified during a relatively short
period of time. When matter disappears behind a horizon, an exterior observer sees
almost nothing of its properties. One can no longer say, for example, how many baryons
formed the black hole. A huge amount of information thus seems lost. The mass and
angular momentum completely determine the external field, which is known
analytically (Kerr solution). This led Wheeler to say that "a black hole has no hair" (Wheeler,
1971, pp. 191-192). The preceding statement is now known as the no-hair theorem.

Figure 1: Cumulative shift of periastron time for the Hulse-Taylor pulsar according to
observation (dots) and theory (solid line). The theoretical prediction takes into account the energy
loss due to the emission of gravitational radiation.

The proof of this theorem is an outstanding contribution to mathematical physics,
and was completed in the span of only a few years by various authors (Israel, Carter,
Hawking, and Robinson). A decisive first step was taken by Israel (1967), who was
able to show that a static black-hole solution of Einstein's vacuum equation has to
be spherically symmetric and, therefore, agree with the Schwarzschild solution. In a 
second paper Israel extended this result to black-hole solutions of the coupled Einstein- 
Maxwell system. The Reissner-Nordstrom 2-parameter family, it turned out, exhausts 
the class of static so-called electrovac black holes. It was then conjectured by Israel, 
Penrose and Wheeler that in the stationary case the electrovac black holes should all 
be given by the 3-parameter Kerr-Newman family. After a number of steps, supplied 
by various authors, this conjecture could finally be proven. See (Heusler, 1996) for a 
comprehensive account of black-hole uniqueness results. 

The evidence for black holes in some X-ray binary systems and for super-massive 
black holes in galactic centers is still indirect, but has become overwhelming during 
the past few years. However, there is little evidence so far that these collapsed objects 
are described by the Kerr metric. 

Until a few years ago the best one could say about the evidence for super-massive
black holes in the center of some galaxies was that it was compelling if dynamical
studies and observations of active galactic nuclei were taken together. In the meantime
the situation has improved radically. The beautiful work of Genzel and his coworkers
has established a dark-mass concentration of about $3 \times 10^6\,M_\odot$ near the center of the
Milky Way with an extension of less than 17 light hours (see, e.g., Ott et al., 2003). If
this were a cluster of low mass stars or neutron stars, its central density would exceed
$10^{17}\,M_\odot/{\rm pc}^3$ and it would not survive for more than a few $10^5$ years. The least exotic
interpretation of this enormous dark mass concentration is that it is a black hole. But this
is not the only possibility. Although alternative interpretations are highly implausible,
they illustrate the point that dynamical studies alone cannot give an incontrovertible
proof of the existence of black holes. Ideally, one would like to show that some black 
hole candidate actually has an event horizon. There have been various attempts in this 
direction, but presumably only gravitational wave astronomy will reveal the essential 
properties of black holes. 

Most astrophysicists do not worry about possible remaining doubts. The evidence 
for (super-massive) black holes has become so overwhelming that the burden of proof 
is now on the hard-core skeptics. 

6.6 Relativistic cosmology 

In 1917 Einstein applied GR for the first time to cosmology, and found the first cos- 
mological solution of a consistent theory of gravity (CPAE, Vol. 6, Doc. 43). In spite 
of its drawbacks this bold step can be regarded as the beginning of modern cosmology. 
It is still interesting to read this paper about which Einstein says: "I shall conduct the 
reader over the road that I have myself travelled, rather a rough and winding road, 
because otherwise I cannot hope that he will take much interest in the result at the end 
of the journey." In a letter to Ehrenfest of February 4, 1917 (CPAE, Vol. 8, Doc. 294), 
Einstein wrote about his attempt: "I have again perpetrated something relating to the 
theory of gravitation that might endanger me of being committed to a madhouse." 48 

In his attempt Einstein assumed — and this was completely novel — that space is 
globally closed. This was because he believed at the time that this was the only way to 
satisfy what he later (CPAE, Vol. 7, Doc. 4) named Mach's principle, the requirement 
that the metric field be determined uniquely by the energy-momentum tensor. In these 
early years, and for quite some time, Mach's ideas on the origin of inertia played an 
important role in Einstein's thinking (for a discussion, see, e.g., Janssen, 2005). This 
may even be the primary reason that he turned to cosmology so soon after the comple- 
tion of GR. Einstein was convinced that isolated masses cannot impose a structure on 
space at infinity. His intention was to eliminate all vestiges of absolute space. It is for 
such reasons that he postulated a universe that is spatially finite and closed, a universe 
in which no boundary conditions are needed. Einstein was already thinking about the 
problem regarding the choice of boundary conditions at infinity in spring 1916. In a 
letter to Michele Besso from May 14, 1916 (CPAE, Vol. 8, Doc. 219) he mentions the 
possibility of the world being finite. A few months later he developed these ideas in
correspondence with Willem de Sitter.

Einstein assumed that the Universe was not only closed but also static. This was 
not unreasonable at the time, because the relative velocities of the stars as observed 
were small. 49 

48 "Ich habe wieder etwas verbrochen in der Gravitationstheorie, was mich ein wenig in Gefahr bringt, in
ein Tollhaus interniert zu werden."

49 Recall that astronomers only learned later that spiral nebulae are independent star systems outside the
Milky Way. This was definitively established when Hubble found in 1924 that there were Cepheid
variables in Andromeda as well as in other galaxies. Five years later he announced the recession of
galaxies.

These two assumptions, however, were incompatible with Einstein's original field
equations. For this reason, Einstein added the famous $\Lambda$-term, which is compatible
with the principles of GR, in particular with the energy-momentum law
$\nabla_\nu T^{\mu\nu} = 0$ for matter. The modified field equations are (compare (77))

$$G_{\mu\nu} = (8\pi G/c^4)\,T_{\mu\nu} - \Lambda g_{\mu\nu} . \qquad (89)$$

The cosmological term is, in four dimensions, the only possible addition to the field 
equations if no higher than second order derivatives of the metric are allowed (Love- 
lock's theorem; see Lovelock (1971)). This remarkable uniqueness is one of the most 
attractive features of general relativity. (In higher dimensions additional terms satisfy- 
ing this requirement are allowed.) 

For the static Einstein universe the field equations imply the two relations 

$$(4\pi G/c^2)\,\rho = \frac{1}{a^2} = \Lambda , \qquad (90)$$

where $\rho$ is the mass density of the dust-filled universe (zero pressure) and a is the
radius of curvature. (In passing we remark that the Einstein universe is the only static
dust solution; one does not have to assume isotropy or homogeneity. Its instability
was demonstrated by Lemaître in 1927.) Einstein was very pleased by this direct
connection between the mass density and geometry, because he thought that this was
in accord with Mach's philosophy.
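To make the relations (90) concrete, the sketch below computes the radius of curvature and the cosmological constant of the static Einstein universe for a given dust density. The density of 1e-26 kg/m^3 is purely an illustrative assumption, not a value from the text, and the function name is hypothetical.

```python
import math

# Rounded constants (assumed for illustration)
G = 6.674e-11            # m^3 kg^-1 s^-2
C = 2.998e8              # m/s
LIGHT_YEAR = 9.461e15    # m


def einstein_universe(rho):
    """Return (a, Lambda) from (90): (4 pi G / c^2) rho = 1/a^2 = Lambda."""
    lam = 4.0 * math.pi * G * rho / C**2   # cosmological constant, m^-2
    a = 1.0 / math.sqrt(lam)               # radius of curvature, m
    return a, lam


# Illustrative dust density in kg/m^3 (an assumption, not taken from the text)
RHO_EXAMPLE = 1.0e-26
radius_m, cosmological_constant = einstein_universe(RHO_EXAMPLE)
print(f"a = {radius_m:.2e} m = {radius_m / LIGHT_YEAR:.2e} light years")
print(f"Lambda = {cosmological_constant:.2e} m^-2")
```

The relation ties the two quantities rigidly together: once the density is chosen, both the curvature radius and the value of the cosmological term are fixed, which is the direct connection between matter and geometry that pleased Einstein.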

Einstein concludes with the following sentences: 

"In order to arrive at this consistent view, we admittedly had to introduce 
an extension of the field equations of gravitation which is not justified 
by our actual knowledge of gravitation. It has to be emphasized, how- 
ever, that a positive curvature of space is given by our results, even if the 
supplementary term is not introduced. That term is necessary only for 
the purpose of making possible a quasi-static distribution of matter, as
required by the fact of the small velocities of the stars." (CPAE, Vol. 6,
Doc. 43, p. 551)

In a letter to De Sitter of March 12, 1917 (CPAE, Vol. 8, Doc. 311), Einstein em- 
phasized that his model was intended primarily to settle the question "whether the 
basic idea of relativity can be followed through its completion, or whether it leads to 
contradictions". He added that it was an entirely different matter whether the model
corresponds to reality.

Only later did Einstein come to realize that Mach's philosophy is predicated on 
an antiquated ontology that seeks to reduce the metric field to an epiphenomenon of 
matter. It became increasingly clear to him that the metric field has an independent ex- 
istence (corresponding to physical degrees of freedom), and his enthusiasm for Mach's 
principle gradually evaporated. In a letter to Pirani he wrote in 1954: "As a matter of 
fact, one should no longer speak of Mach's principle at all." (Pais, 1982, Sec. 15e). 50 
The absolute existence of the spacetime continuum, independent of any matter, is a 
remnant in GR of Newton's absolute space and time. For a modern and comprehensive 
discussion of various aspects of Mach's principle and their status in GR, see (Barbour 
& Pfister, 1995). 

50 "Von dem Machschen Prinzip sollte man eigentlich überhaupt nicht mehr sprechen."
From static to expanding world models

It must have come as quite a shock to Einstein that, within days of receiving the letter
in which he described his cosmological model (CPAE, Vol. 8, Doc. 311), De
Sitter had found a completely different cosmological model, also allowed by the new
field equations with cosmological term, that was anti-Machian in that it contained
no matter whatsoever (CPAE, Vol. 8, Doc. 312; De Sitter 1917a). For this reason,
Einstein tried to discard it on various grounds (more on this below). Einstein and De 
Sitter mostly discussed De Sitter's solution in its so-called static form (CPAE, Vol. 8, 
Doc. 355; De Sitter 1917b):

$$ds^2 = -\left(1 - \frac{r^2}{R^2}\right) c^2\,dt^2 + \frac{dr^2}{1 - r^2/R^2} + r^2\left(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2\right).$$

The spatial metric is that of a three-sphere of radius $R$, determined by $\Lambda = 3/R^2$. The
model had one very interesting property: For light sources moving along static world
lines there is a gravitational redshift, which became known as the De Sitter effect.
This was thought to have some bearing on the redshift results obtained by Slipher.
Because the fundamental (static) worldlines in this model are not geodesics, a
freely-falling particle released by any static observer accelerates away from such an observer,
generating local velocity (Doppler) redshifts corresponding to peculiar velocities. In
his famous book, "The Mathematical Theory of Relativity", Eddington wrote about 
this: 

"De Sitter's theory gives a double explanation for this motion of reces- 
sion; first, there is the general tendency to scatter (...); second, there is a 
general displacement of spectral lines to the red in distant objects owing to 
the slowing down of atomic vibrations (...), which would be erroneously 
interpreted as a motion of recession." (Eddington, 1924, p. 161) 
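
The second of Eddington's two effects can be made explicit with the standard redshift formula for observers at rest in a static metric. The following is a sketch, assuming the static form of De Sitter's solution with $g_{tt} = -(1 - r^2/R^2)\,c^2$, an emitter at rest at radius $r$ and an observer at rest at $r = 0$; this particular configuration is chosen for illustration.

```latex
\begin{align*}
1 + z \;=\; \frac{\nu_{\mathrm{em}}}{\nu_{\mathrm{obs}}}
      \;=\; \sqrt{\frac{g_{tt}(0)}{g_{tt}(r)}}
      \;=\; \left(1 - \frac{r^{2}}{R^{2}}\right)^{-1/2}
      \;\approx\; 1 + \frac{r^{2}}{2R^{2}}
      \qquad (r \ll R).
\end{align*}
```

Under these assumptions the purely gravitational shift grows quadratically with distance for nearby sources and diverges as $r \to R$.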

We do not want to enter into all the confusion over the De Sitter universe (see, e.g.,
CPAE, Vol. 8, pp. 351-357, the editorial note, "The Einstein-De Sitter-Weyl-Klein Debate").
One source of confusion was the apparent singularity at $r = R = (3/\Lambda)^{1/2}$. This was
thoroughly misunderstood at first even by Einstein and Weyl. In the end, Einstein had 
to acknowledge that De Sitter's solution is fully regular and matter-free and thus indeed 
a counterexample to Mach's principle. But he still discarded the solution as physically 
irrelevant because it is not globally static. This is clearly expressed in a letter from 
Weyl to Klein, dated February 7, 1919 (quoted in CPAE, Vol. 8, Doc. 567, note 3), 
after Weyl had discussed the issue during a visit of Einstein to Zurich. An important 
discussion of the redshift of galaxies in De Sitter's model by Weyl in 1923 should 
be mentioned. Weyl (1923) introduced an expanding version of the De Sitter model. 
For small distances his result reduced to what later became known as the Hubble law. 51 
Independently of Weyl, Lanczos (1922) also introduced a non-stationary interpretation 
of De Sitter's solution in the form of a Friedmann spacetime with positive spatial 
curvature. In a subsequent paper he also derived the redshift for the non-stationary 
interpretation (Lanczos 1923). 

51 We recall that the de Sitter model has many different interpretations, depending on the class of funda- 
mental observers that is singled out. This point was first stressed by Lanczos (1922). 
Until about 1930 almost everybody 'knew' that the universe was static,
notwithstanding two fundamental papers by Friedmann (1922, 1924) and independent work
by Lemaitre (1927). 52 These path-breaking papers were largely ignored. The history
of this early period has — as is often the case — been distorted by some widely read 
documents. Einstein too accepted the idea of an expanding universe only much later. 
After Friedmann's first paper, he published a brief note claiming to have found an
error in Friedmann's work; when it was pointed out to him that the error was his own,
Einstein published a retraction of his comment. The draft of the retraction contained
a sentence that (fortunately for him) was deleted before publication: "[Friedmann's
paper] while mathematically correct is of no physical significance" (Stachel 2002,
p. 469). In comments to Lemaitre during the Solvay meeting in 1927, Einstein again
rejected the expanding universe solutions as physically unacceptable. According to
Lemaitre, Einstein told him: "Vos calculs sont corrects, mais votre physique est
abominable" ("Your calculations are correct, but your physics is abominable")
(Schucking, 1993). On the other hand, we found in the archive of the ETH many years
ago a 1923 postcard from Einstein to Weyl, related to Weyl's reinterpretation of De
Sitter's solution, with the following interesting sentence: "If there is no
quasi-static world, then away with the cosmological term." This goes to show once
again that history is not as simple as it is often portrayed.

It is also not well-known that Hubble interpreted his famous results on the redshift 
of the radiation emitted by distant 'nebulae' in the framework of the De Sitter model. 
The general attitude is well illustrated by the following remark of Eddington at a Royal 
Society meeting in January 1930: "One puzzling question is why there should be only 
two solutions. I suppose the trouble is that people look for static solutions" (Eddington, 
1930, p. 850). Lemaitre, who had been for a short time a post-doctoral student of 
Eddington's, read this remark in a report of the meeting published in Observatory, and
wrote to Eddington alerting him to his 1927 paper. Eddington had seen that paper, 
but had completely forgotten about it. But now he was greatly impressed and praised 
Lemaitre's work in a letter to Nature. He also arranged for a translation which appeared 
in Monthly Notices of the Royal Astronomical Society (Lemaitre 1931). 

Lemaitre's successful explanation of Hubble's discovery finally changed the view- 
point of the majority of workers in the field. At this point Einstein (1931) rejected 
the cosmological term as superfluous and no longer justified. At the end of the
paper he made some remarks about the age problem, which was quite severe without the
$\Lambda$-term, since Hubble's value for the Hubble parameter at the time was about seven
times too large. Einstein, however, was not too worried and suggested two ways out.
First, he pointed out that the matter distribution is in reality inhomogeneous and that
the approximate treatment may be illusory. Secondly, he cautioned against large
extrapolations in time in astronomy.
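
The size of the age problem can be illustrated with a short calculation of the Hubble time $1/H_0$. This is a sketch only: the two input values (500 km/s/Mpc as a stand-in for Hubble's early determination and 70 km/s/Mpc as a present-day value) are assumptions that do not appear in the text, and the function names are hypothetical.

```python
# Hubble time 1/H0 for two values of the Hubble parameter.
# ASSUMPTION: 500 km/s/Mpc stands in for Hubble's original value and
# 70 km/s/Mpc for a present-day value; neither number appears in the text.

KM_PER_MPC = 3.0857e19          # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16    # seconds in 10^9 Julian years


def hubble_time_gyr(h0_km_s_mpc: float) -> float:
    """Return the Hubble time 1/H0 in Gyr for H0 given in km/s/Mpc."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC
    return 1.0 / h0_per_second / SECONDS_PER_GYR


def matter_only_age_gyr(h0_km_s_mpc: float) -> float:
    """Age of a spatially flat, matter-only model without Lambda: (2/3)/H0."""
    return (2.0 / 3.0) * hubble_time_gyr(h0_km_s_mpc)


if __name__ == "__main__":
    for h0 in (500.0, 70.0):
        print(
            f"H0 = {h0:5.0f} km/s/Mpc: 1/H0 = {hubble_time_gyr(h0):5.2f} Gyr, "
            f"matter-only age = {matter_only_age_gyr(h0):5.2f} Gyr"
        )
```

With the larger assumed value the Hubble time comes out at roughly 2 Gyr, and the age of a matter-only model without $\Lambda$ is shorter still, which conveys why the time scale looked uncomfortably short.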

Einstein repeated his new viewpoint much later (Einstein 1945), and it was adopted 
by many other influential workers, e.g., by Pauli (1958, supplementary note 19). 
Whether Einstein really considered the introduction of the $\Lambda$-term as "the biggest
blunder of his life", as related by Gamow (1970, p. 44), appears doubtful to us. In his
published work and extant letters such a strong statement is nowhere to be found. Einstein
discarded the cosmological term merely for reasons of simplicity. For a minority of

52 For discussion of Lemaitre's work, see Eisenstaedt, 1993. 
cosmologists this was not sufficient reason. Paraphrasing Rabi 53 one could ask: 'who
ordered it away?'

Einstein's paper (1931) was his last one on cosmology. In hindsight it is somewhat 
puzzling that after he saw the first paper by Friedmann he did not realize that his own 
static model was unstable. 

Vacuum energy and gravity 

So much for the classical discussion of the $\Lambda$-term; we do, however, want to add a few
remarks on the $\Lambda$-problem in the context of quantum theory, where the problem becomes
very serious indeed. Since quantum physicists were facing so many other problems,
it need not surprise us that in the early years they did not worry about this particular
one. An exception was Pauli, who wondered in the early 1920s whether the zero-point
energy of the radiation field could have significant gravitational effects. He estimated
the influence of the zero-point energy of the electromagnetic radiation field, cut off at
the classical electron radius, on the radius of the universe, and came to the conclusion
that the universe "could not even reach to the moon" (for more on this, see Straumann 2003a).
Pauli's only published remark on his considerations can be found in his Handbuch
article on quantum mechanics, in the section on the quantization of the radiation field, 
where he says: "Also, as is obvious from experience, the [zero-point energy] does not 
produce any gravitational field" (Pauli, 1933, p. 250). 
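
An estimate of the kind attributed to Pauli above can be sketched numerically. The text only states that the zero-point energy is cut off at the classical electron radius; the precise cut-off convention ($\omega_{\max} = 2\pi c / r_e$), the mode counting, and the use of the static Einstein relation $(4\pi G/c^2)\rho = 1/a^2$ are assumptions made here for illustration, and the function names are hypothetical.

```python
import math

# Physical constants (SI, rounded)
HBAR = 1.0546e-34        # reduced Planck constant, J s
C = 2.9979e8             # speed of light, m / s
G = 6.674e-11            # Newton's constant, m^3 / (kg s^2)
R_ELECTRON = 2.818e-15   # classical electron radius, m
MOON_DISTANCE = 3.84e8   # mean Earth-Moon distance, m


def zero_point_mass_density(cutoff_length: float) -> float:
    """Mass density (kg/m^3) of electromagnetic zero-point energy.

    ASSUMPTION: modes are counted up to omega_max = 2*pi*c / cutoff_length,
    with two polarizations and energy hbar*omega/2 per mode, which gives
    an energy density hbar * omega_max**4 / (8 * pi**2 * c**3).
    """
    omega_max = 2.0 * math.pi * C / cutoff_length
    energy_density = HBAR * omega_max**4 / (8.0 * math.pi**2 * C**3)
    return energy_density / C**2


def einstein_radius(mass_density: float) -> float:
    """Radius a of the static Einstein universe from (4 pi G / c^2) rho = 1 / a^2."""
    return C / math.sqrt(4.0 * math.pi * G * mass_density)


if __name__ == "__main__":
    rho = zero_point_mass_density(R_ELECTRON)
    a = einstein_radius(rho)
    print(f"zero-point mass density ~ {rho:.2e} kg/m^3")
    print(f"Einstein radius ~ {a / 1e3:.0f} km; Moon at {MOON_DISTANCE / 1e3:.0f} km")
```

Under these assumptions the resulting radius is of the order of tens of kilometres, far below the distance to the Moon, in line with the quoted conclusion.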

For decades nobody else seems to have worried about contributions of quantum 
fluctuations to the cosmological constant, although physicists learned after Dirac's hole 
theory that the vacuum state in quantum field theory is not empty but has interesting 
physical properties. As far as we know, the first one to come back to possible con- 
tributions of the vacuum energy density to the cosmological constant was Zel'dovich. 
He discussed this issue in two papers (Zel'dovich, 1967, 1968) during the third
renaissance period of the $\Lambda$-term, but before the advent of spontaneously-broken gauge
theories. He pointed out that, even if one assumes in a completely ad-hoc fashion that
the zero-point contributions to the vacuum energy density are exactly cancelled by a
bare term, there still remain higher-order effects. In particular, gravitational
interactions between the particles in the vacuum fluctuations are expected on dimensional
grounds to lead to a gravitational self-energy density of order $G\mu^6$, where $\mu$ is some
cut-off scale. Even for $\mu$ as low as 1 GeV, this is about 9 orders of magnitude larger
than the observational bound.
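
The order of magnitude quoted here can be checked in natural units ($\hbar = c = 1$), where $G = 1/M_{\mathrm{Pl}}^2$. The sketch below assumes a round value of $10^{-47}\,\mathrm{GeV}^4$ for the observational bound on the vacuum energy density; the text does not quote a number, so this value and the function names are illustrative assumptions.

```python
import math

# Planck mass in GeV (hbar = c = 1), so that G = 1 / M_Planck^2.
M_PLANCK_GEV = 1.22e19
# ASSUMPTION: round-number stand-in for the observational bound on the
# vacuum energy density; the text does not quote a value.
RHO_BOUND_GEV4 = 1.0e-47


def gravitational_self_energy_density(mu_gev: float) -> float:
    """Dimensional estimate G * mu^6 in GeV^4 for a cut-off scale mu in GeV."""
    newton_g = 1.0 / M_PLANCK_GEV**2      # GeV^-2
    return newton_g * mu_gev**6


def orders_above_bound(mu_gev: float) -> float:
    """log10 of the ratio between the estimate and the assumed bound."""
    return math.log10(gravitational_self_energy_density(mu_gev) / RHO_BOUND_GEV4)


if __name__ == "__main__":
    for mu in (1.0, 100.0):
        print(f"mu = {mu:6.1f} GeV: estimate exceeds bound by "
              f"~10^{orders_above_bound(mu):.1f}")
```

Because the estimate scales as $\mu^6$, each factor of ten in the cut-off adds six orders of magnitude to the discrepancy.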

This strongly suggests that there is something profound that we do not seem to 
understand at all, certainly not in quantum field theory (nor, at least so far, in string 
theory). We are unable to calculate the vacuum energy density in quantum field theo- 
ries, such as the Standard Model of particle physics. But we can attempt to make what 
appear to be reasonable order-of-magnitude estimates for the various contributions. 
All expectations are in dramatic conflict with the facts (see, e.g., Straumann, 2003b). 
Trying to arrange the cosmological constant to be zero is unnatural in a technical sense. 
It is like forcing a particle to be massless by fine-tuning the parameters of the theory
when there is no symmetry principle implying a vanishing mass. The vacuum energy 
density is unprotected from large quantum corrections. This problem is particularly 

53 When Rabi heard at a conference for the first time that the muon had been discovered, his reaction was:
"who has ordered it?" (see, e.g., Feynman, 1985, p. 165). 
severe in field theories with spontaneous symmetry breaking. In such models there are
usually several possible vacuum states with different energy densities. Furthermore,
the energy density is determined by what is called the effective potential, and this is a
dynamical object. Nobody can see any reason why the vacuum of the Standard Model
in which we ended up as the universe cooled has, by the standards of particle physics, an
almost vanishing energy density. Most likely, we shall only find a satisfactory answer 
once we have a theory that successfully combines the concepts and laws of GR with 
those of quantum theory. 

For a number of years now, cosmology has been going through a fruitful and excit- 
ing period. Some of the developments are clearly of general interest, well beyond the 
fields of astrophysics and cosmology. Lack of space prevents us from even indicating 
the most important issues of current interest. 